VIRAL MATERIAL AND NUCLEOTIDE FRAGMENTS ASSOCIATED WITH
MULTIPLE SCLEROSIS, FOR DIAGNOSTIC, PROPHYLACTIC AND
THERAPEUTIC PURPOSES

Multiple sclerosis (MS) is a demyelinating
disease of the central nervous system (CNS), the cause of
which remains as yet unknown.

"Multiple sclerosis (MS) is the most common
neurological disease of young adults with a prevalence in
Europe and North America of between 20 and 200 per
100,000. It is characterized clinically by a
relapsing/remitting or chronic progressive course
frequently leading to severe disability. Current knowledge
suggests that MS is associated with autoimmunity, that
genetic background has an important influence and that
"infectious" agent(s) may be involved. Indeed, many
viruses have been proposed as possible candidates but as
yet, none of them has been shown to play an aetiological
role."

Many studies have supported the hypothesis of a
viral aetiology of the disease, but none of the known
viruses tested has proved to be the causal agent. A
review of the viruses sought for several years in MS has
been compiled by E. Norrby (1) and R.T. Johnson (2).
The discovery of pathogenic retroviruses in man
(HTLVs and HIV) was followed by great interest in their
ability to impair the immune system and to provoke central
nervous system inflammation and/or degeneration. In the
case of HTLV-1, its association with a chronic
inflammatory demyelinating disease in man led to
extensive investigations to search for an HTLV-1-like
retrovirus in MS patients. However, despite initial
claims, the presence of HTLV-1 or HTLV-like retroviruses
was not confirmed.

Recently, a retrovirus different from the known
human retroviruses has been isolated in patients suffering
from MS (3, 4 and 5).

In 1989, the authors described the production of
extracellular virions, associated with reverse
transcriptase (RT) activity, by a culture of
leptomeningeal cells (LM7) obtained from the cerebrospinal
fluid of a patient with MS (3). This was followed by
similar findings in monocyte cultures from a series of MS
patients (5). Neither viral particles nor viral RT
activity were found in control individuals. Furthermore,
the authors were able to transfer the LM7 virus to
non-infected leptomeningeal cells in vitro (26). The molecular
characterization of the "LM7" retrovirus was a
prerequisite for further evaluation of its possible role
in MS. Considerable difficulties arose from the absence of
continuously productive retroviral cultures and from the
low levels of expression in the few transient cultures.
The strategy described here focused on RNA from
extracellular virions, in order to avoid non-specific
detection of cellular RNA and of endogenous elements from
contaminating human DNA. A specific retroviral sequence
associated with virions produced by cell cultures from
several MS patients has been identified. The entire
sequence of this novel retroviral genome is currently
being obtained using RT-PCR on RNA from extracellular
virions. The retrovirus previously called "LM7 virus"
corresponds to an oncovirus and is now designated MSRV
(Multiple Sclerosis-associated Retrovirus).

The authors were also able to show that this
retrovirus could be transmitted in vitro, that patients
suffering from MS produced antibodies capable of
recognizing proteins associated with the infection of
leptomeningeal cells by this retrovirus, and that the
expression of the latter could be strongly stimulated by
the immediate-early genes of some herpesviruses (6).



All these results point to the role in MS of at
least one unknown retrovirus or of a virus having reverse
transcriptase activity which is detectable according to
the method published by H. Perron (3) and qualified as
"LM7-like RT" activity. The content of the publication
identified by (3) is incorporated in the present
description by reference.

Recently, the Applicant's studies have enabled
two continuous cell lines infected with natural isolates
originating from two different patients suffering from MS
to be obtained by a culture method as described in the
document WO-A-93/20188, the content of which is
incorporated in the present description by reference. These two
lines, derived from human choroid plexus cells, designated
LM7PC and PLI-2, were deposited with the ECACC on
22nd July 1992 and 8th January 1993, respectively, under
numbers 92072201 and 93010817, in accordance with the
provisions of the Budapest Treaty. Moreover, the viral
isolates possessing LM7-like RT activity were also
deposited with the ECACC under the overall designation of
"strains". The "strain" or isolate harboured by the PLI-2
line, designated POL-2, was deposited with the ECACC on
22nd July 1992 under No. V92072202. The "strain" or
isolate harboured by the LM7PC line, designated MS7PG, was
deposited with the ECACC on 8th January 1993 under
No. V93010816.

Starting from the cultures and isolates
mentioned above, characterized by biological and
morphological criteria, the next step was to endeavour to
characterize the nucleic acid material associated with the
viral particles produced in these cultures.

The portions of the genome which have already
been characterized have been used to develop tests for
molecular detection of the viral genome and
immunoserological tests, using the amino acid sequences
encoded by the nucleotide sequences of the viral genome,
in order to detect the immune response directed against
epitopes associated with the infection and/or viral
expression.

These tools have already enabled an association
to be confirmed between MS and the expression of the
sequences identified in the patents cited later. However,
the viral system discovered by the Applicant is related to
a complex retroviral system. In effect, the sequences to
be found encapsidated in the extracellular viral particles
produced by the different cultures of cells of patients
suffering from MS show clearly that there is
coencapsidation of retroviral genomes which are related to
but different from the "wild-type" retroviral genome which
produces the infective viral particles. This phenomenon
has been observed between replicative retroviruses and
endogenous retroviruses belonging to the same family, or
even heterologous retroviruses. The notion of endogenous
retroviruses is very important in the context of our
discovery since, in the case of MSRV-1, it has been
observed that endogenous retroviral sequences comprising
sequences homologous to the MSRV-1 genome exist in normal
human DNA. The existence of endogenous retroviral elements
(ERVs) related to MSRV-1 by all or part of their genome
explains the fact that the expression of the MSRV-1
retrovirus in human cells is able to interact with closely
related endogenous sequences. These interactions are to be
found in the case of pathogenic and/or infectious
endogenous retroviruses (for example some ecotropic
strains of the murine leukaemia virus), and in the case of
exogenous retroviruses whose nucleotide sequence may be
found partially or wholly, in the form of ERVs, in the
host animal's genome (e.g. mouse exogenous mammary tumor
virus transmitted via the milk). These interactions
consist mainly of (i) a trans-activation or coactivation
of ERVs by the replicative retrovirus, (ii) an
"illegitimate" encapsidation of RNAs related to ERVs, or
of ERVs - or even of cellular RNAs - simply possessing
compatible encapsidation sequences, in the retroviral
particles produced by the expression of the replicative
strain, which are sometimes transmissible and sometimes
with a pathogenicity of their own, and (iii) more or less
substantial recombinations between the coencapsidated
genomes, in particular in the phases of reverse
transcription, which lead to the formation of hybrid
genomes, which are sometimes transmissible and sometimes
with a pathogenicity of their own.

Thus, (i) different sequences related to MSRV-1
have been found in the purified viral particles; (ii)
molecular analysis of the different regions of the MSRV-1
retroviral genome should be carried out by systematically
analyzing the coencapsidated, interfering and/or
recombined sequences which are generated by the infection
and/or expression of MSRV-1; furthermore, some clones may
have defective sequence portions produced by the
retroviral replication and template errors and/or errors
of transcription of the reverse transcriptase; (iii) the
families of sequences related to the same retroviral
genomic region provide the means for an overall diagnostic
detection which may be optimized by the identification of
invariable regions among the clones expressed, and by the
identification of reading frames responsible for the
production of antigenic and/or pathogenic polypeptides
which may be produced only by a portion, or even by just
one, of the clones expressed, and, under these conditions,
the systematic analysis of the clones expressed in the
region of a given gene enables the frequency of variation
and/or of recombination of the MSRV-1 genome in this
region to be evaluated and the optimal sequences for the
applications, in particular diagnostic applications, to be
defined; (iv) the pathology caused by a retrovirus such as
MSRV-1 may be a direct effect of its expression and of the
proteins or peptides produced as a result thereof, but
also an effect of the activation, the encapsidation or the
recombination of related or heterologous genomes and of
the proteins or peptides produced as a result thereof;
thus, these genomes associated with the expression of
and/or infection by MSRV-1 are an integral part of the
potential pathogenicity of this virus, and hence
constitute means of diagnostic detection and special
therapeutic targets. Similarly, any agent associated with
or cofactor of these interactions responsible for the
pathogenesis in question, such as MSRV-2 or the gliotoxic
factor which are described in the patent application
published under No. FR-2,716,198, may participate in the
development of an overall and very effective strategy for
the diagnosis, prognosis, therapeutic monitoring and/or
integrated therapy of MS in particular, but also of any
other disease associated with the same agents.

In this context, a parallel discovery has been
made in another autoimmune disease, rheumatoid arthritis
(RA), which has been described in the French Patent
Application filed under No. 95/02960. This discovery shows
that, by applying methodological approaches similar to the
ones which were used in the Applicant's work on MS, it was
possible to identify a retrovirus expressed in RA which
shares the sequences described for MSRV-1 in MS, and also
the coexistence of an associated MSRV-2 sequence also
described in MS. As regards MSRV-1, the sequences detected
in common in MS and RA relate to the pol and gag genes. In
the current state of knowledge, it is possible to
associate the gag and pol sequences described with the
MSRV-1 strains expressed in these two diseases.

The present patent application relates to 
various results which are additional to those already 
protected by the following French Patent Applications: 
- No. 92/04322 of 03.04.1992, published under
No. 2,689,519;
- No. 92/13447 of 03.11.1992, published under
No. 2,689,521;
- No. 92/13443 of 03.11.1992, published under
No. 2,689,520;
- No. 94/01529 of 04.02.1994, published under
No. 2,715,936;
- No. 94/01531 of 04.02.1994, published under
No. 2,715,939;
- No. 94/01530 of 04.02.1994, published under
No. 2,715,936;
- No. 94/01532 of 04.02.1994, published under
No. 2,715,937;
- No. 94/14322 of 24.11.1994, published under
No. 2,727,428;
- and No. 94/15810 of 23.12.1994, published under
No. 2,728,585.

The present invention relates, in the first 
place, to a viral material, in the isolated or purified 
state, which may be recognized or characterized in 
different ways: 

- its genome comprises a nucleotide sequence chosen from
the group including the sequences SEQ ID NO:46, SEQ ID
NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:56, SEQ ID
NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID
NO:89, their complementary sequences and their equivalent
sequences, in particular nucleotide sequences displaying,
for any succession of 100 contiguous monomers, at least
50% and preferably at least 70% homology with the said
sequences SEQ ID NO:46, SEQ ID NO:51, SEQ ID NO:52, SEQ ID
NO:53, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:59, SEQ ID
NO:60, SEQ ID NO:61, SEQ ID NO:89, respectively, and their
complementary sequences;

- the region of its genome comprising the env and pol
genes and a portion of the gag gene, excluding the
subregion having a sequence identical or equivalent to
SEQ ID NO:1, codes for any polypeptide displaying, for any
contiguous succession of at least 30 amino acids, at least
50% and preferably at least 70% homology with a peptide
sequence encoded by any nucleotide sequence chosen from
the group including SEQ ID NO:46, SEQ ID NO:51, SEQ ID
NO:52, SEQ ID NO:53, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:89 and their
complementary sequences;

- the pol gene comprises a nucleotide sequence partially
or totally identical or equivalent to SEQ ID NO:57 or SEQ
ID NO:93, excluding SEQ ID NO:1;

- the gag gene comprises a nucleotide sequence partially
or totally identical or equivalent to SEQ ID NO:88.
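
By way of illustration only, the criterion "for any succession of 100 contiguous monomers, at least 50% homology" can be sketched as a sliding-window comparison. The description does not state how homology is computed; the sketch below assumes simple percent identity over two sequences that have already been aligned without gaps to equal length, and the function names `window_identities` and `meets_threshold` are hypothetical.

```python
def window_identities(ref: str, query: str, window: int = 100) -> list[float]:
    """Percent identity for every window of `window` contiguous positions.

    Assumption (not from the description): `ref` and `query` are already
    aligned without gaps and have equal length.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be pre-aligned to equal length")
    if len(ref) < window:
        raise ValueError("sequences are shorter than the window")
    matches = [a == b for a, b in zip(ref.upper(), query.upper())]
    # Keep a running count of matches inside the current window.
    count = sum(matches[:window])
    result = [100.0 * count / window]
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]
        result.append(100.0 * count / window)
    return result


def meets_threshold(ref: str, query: str, window: int = 100,
                    threshold: float = 50.0) -> bool:
    """True if every window reaches the threshold (one reading of 'any')."""
    return min(window_identities(ref, query, window)) >= threshold
```

Reading "any succession" as "every window" is the stricter interpretation; a looser reading would test whether at least one window reaches the threshold, which amounts to replacing `min` with `max`.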

As indicated above, according to the present
invention, the viral material as defined above is
associated with MS. And as defined by reference to the pol
or gag gene of MSRV-1, and more especially to the
sequences SEQ ID NOS 51, 56, 57, 59, 60, 61, 88, 89, 93,
169, 170, 171, 172, 176, 177, 178 and 179, this viral
material is associated with RA.

The present invention also relates to a nucleic
material, in the isolated or purified state, having at
least one of the following definitions:

- a nucleic material comprising a nucleotide sequence
selected from the group including sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 50% and preferably at least
60% homology with said sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, and their complementary
sequences, excluding HSERV-9 (or ERV-9); advantageously,
the nucleotide sequence of said nucleic material is
selected from the group including sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 70% and preferably at least
80% homology with said sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, and their complementary
sequences;

- a nucleic material, in the isolated or purified state,
coding for any polypeptide displaying, for any contiguous
succession of at least 30 amino acids, at least 50%,
preferably at least 60%, and most preferably at least 70%
homology with a peptide sequence encoded by any nucleotide
sequence selected from the group including SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179 and their complementary
sequences;

- a nucleic material, in the isolated or purified state,
of retroviral type, comprising a nucleotide sequence
identical or similar to at least part of the pol gene of
an isolated retrovirus associated with multiple sclerosis
or rheumatoid arthritis; advantageously, said nucleotide
sequence is 80% similar to said at least part of the pol
gene;

- a nucleic material comprising a nucleotide sequence
identical or similar to at least part of the pol gene of an
isolated virus encoding a reverse transcriptase having an
enzymatic site comprised between the amino acid domains
LPQG-YXDD, having a phylogenic distance with HSERV-9 of
0.063 ± 0.1, and preferably 0.063 ± 0.05; the phylogenic
distances are calculated on the basis of a reference
sequence according to UPGM tree option of the Geneworks™
Software (INTELLIGENETICS).

By enzymatic site is meant the amino acid domain(s)
conferring the specific activity of a given enzyme.

The present invention also relates to different
nucleotide fragments each comprising a nucleotide sequence
chosen from the group including:

(a) all the genomic sequences, partial and total, of the
pol gene of the MSRV-1 virus, except for the total
sequence of the nucleotide fragment defined by
SEQ ID NO:1;

(b) all the genomic sequences, partial and total, of the
env gene of MSRV-1;

(c) all the partial genomic sequences of the gag gene of
MSRV-1;

(d) all the genomic sequences overlapping the pol gene and
the env gene of the MSRV-1 virus, and overlapping the pol
gene and the gag gene;

(e) all the sequences, partial and total, of a clone
chosen from the group including the clones FBd3
(SEQ ID NO:46), t pol (SEQ ID NO:51), JLBC1
(SEQ ID NO:52), JLBC2 (SEQ ID NO:53) and GM3
(SEQ ID NO:56), FBd13 (SEQ ID NO:58), LB19 (SEQ ID NO:59),
LTRGAG12 (SEQ ID NO:60), FP6 (SEQ ID NO:61), G+E+A
(SEQ ID NO:89), excluding any nucleotide sequence
identical to or lying within the sequence defined by
SEQ ID NO:1;

(f) sequences complementary to the said genomic sequences;

(g) sequences equivalent to the said sequences (a) to (e),
in particular nucleotide sequences displaying, for any
succession of 100 contiguous monomers, at least 50% and
preferably at least 70% homology with the said sequences
(a) to (d),

provided that this nucleotide fragment does not comprise
or consist of the sequence ERV-9 as described in LA MANTIA
et al. (18).

The term genomic sequences, partial or total,
includes all sequences associated by coencapsidation or by
coexpression, or recombined sequences.

Preferably, such a fragment comprises:

- either a nucleotide sequence identical to a partial or
total genomic sequence of the pol gene of the MSRV-1
virus, except for the total sequence of the nucleotide
fragment defined by SEQ ID NO:1, or identical to any
sequence equivalent to the said partial or total genomic
sequence, in particular one which is homologous to the
latter;

- or a nucleotide sequence identical to a partial or total
genomic sequence of the env gene of the MSRV-1 virus, or
identical to any sequence complementary to the said
nucleotide sequence, or identical to any sequence
equivalent to the said nucleotide sequence, in particular
one which is homologous to the latter.

In particular, the invention relates to a
nucleotide fragment comprising a coding nucleotide
sequence which is partially or totally identical to a
nucleotide sequence chosen from the group including:

- the nucleotide sequence defined by SEQ ID NO:40, SEQ ID
NO:62 or SEQ ID NO:89;

- sequences complementary to SEQ ID NO:40, SEQ ID NO:62 or
SEQ ID NO:89;

- sequences equivalent, and in particular homologous, to
SEQ ID NO:40, SEQ ID NO:62 or SEQ ID NO:89;

- sequences coding for all or part of the peptide sequence
defined by SEQ ID NO:39, SEQ ID NO:63 or SEQ ID NO:90;

- sequences coding for all or part of a peptide sequence
equivalent, in particular homologous, to SEQ ID NO:39, SEQ
ID NO:63 or SEQ ID NO:90, which is capable of being
recognized by sera of patients infected with the MSRV-1
virus, or in whom the MSRV-1 virus has been reactivated.

The invention also relates to a nucleotide
fragment (called fragment I) having at least one of the
following definitions:

- a nucleotide fragment comprising a nucleotide sequence
selected from the group including SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 50% and preferably at least
60% homology with said sequences and their complementary
sequences, said group excluding SEQ ID NO:1,
said nucleotide fragment not comprising nor consisting of
the sequence HSERV-9 (or ERV-9); preferably the nucleotide
sequence of said fragment is selected from the group
including SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169,
SEQ ID NO:170, SEQ ID NO:171, SEQ ID NO:172,
SEQ ID NO:176, SEQ ID NO:177, SEQ ID NO:178,
SEQ ID NO:179, their complementary sequences and their
equivalent sequences, in particular nucleotide sequences
displaying, for any succession of 100 contiguous monomers,
at least 70% and preferably at least 80% homology with
said sequences and their complementary sequences;

- a nucleotide fragment comprising a coding nucleotide
sequence which is partially or totally identical to a
nucleotide sequence selected from the group including:
SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169,
SEQ ID NO:170, SEQ ID NO:171, SEQ ID NO:172,
SEQ ID NO:176, SEQ ID NO:177, SEQ ID NO:178,
SEQ ID NO:179; their complementary sequences; their
equivalent sequences, in particular homologous to
SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170,
SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:176,
SEQ ID NO:177, SEQ ID NO:178, SEQ ID NO:179;
sequences encoding all or parts of the peptide
sequence defined by SEQ ID NO:95, SEQ ID NO:173,
SEQ ID NO:174, SEQ ID NO:175, SEQ ID NO:180,
SEQ ID NO:181, SEQ ID NO:182;
sequences encoding all or parts of a peptide
sequence equivalent, in particular homologous, to
SEQ ID NO:95, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175,
SEQ ID NO:180, SEQ ID NO:181, SEQ ID NO:182, which is
capable of being recognized by sera of patients infected
with the MSRV-1 virus, or in whom the MSRV-1 virus has
been reactivated.

The invention also relates to any nucleic acid
probe for the detection of a virus associated with MS and/or
rheumatoid arthritis (RA), which is capable of hybridizing
specifically with any fragment such as is defined above,
belonging to or lying within the genome of the said
pathogenic agent. It relates, in addition, to any nucleic
acid probe for detection of a pathogenic and/or infective
agent associated with RA, which is capable of hybridizing
specifically with any fragment as defined above by
reference to the pol and gag genes, and especially with
respect to the sequences SEQ ID NOS 40, 51, 56, 59, 60,
61, 62, 89 and SEQ ID NOS 39, 63 and 90.

The invention also relates to a primer for the
amplification by polymerization of an RNA or a DNA of a
viral material, associated with MS and/or RA, comprising a
nucleotide sequence identical or equivalent to at least
one portion of the nucleotide sequence of any fragment
such as is defined above, in particular a nucleotide
sequence displaying, for any succession of at least 10
contiguous monomers, preferably 15 contiguous monomers,
more preferably 18 contiguous monomers and even most
preferably 20 contiguous monomers, at least 70% homology
with at least the said portion of the said fragment.
Preferably, the nucleotide sequence of such a primer is
identical to any one of the sequences selected from the
group including SEQ ID NO:47 to SEQ ID NO:50,
SEQ ID NO:55, SEQ ID NO:64, SEQ ID NO:86, SEQ ID NO:99 to
SEQ ID NO:111, SEQ ID NO:183, SEQ ID NO:184,
SEQ ID NO:185, SEQ ID NO:186.
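
As a purely illustrative sketch of the primer criterion above, a candidate oligonucleotide of at least 10 monomers can be compared with every same-length, ungapped stretch of a fragment and retained if its best percent identity reaches 70%. Treating "homology" as ungapped identity on a single strand is an assumption made here for simplicity, and the names `best_primer_identity` and `is_candidate_primer` are hypothetical.

```python
def best_primer_identity(primer: str, fragment: str) -> float:
    """Highest percent identity of `primer` against any ungapped,
    same-length stretch of `fragment` (same strand only)."""
    primer, fragment = primer.upper(), fragment.upper()
    n = len(primer)
    if n == 0 or n > len(fragment):
        raise ValueError("primer must be non-empty and no longer than the fragment")
    best = 0
    for start in range(len(fragment) - n + 1):
        # Count identical positions between the primer and this stretch.
        hits = sum(p == f for p, f in zip(primer, fragment[start:start + n]))
        best = max(best, hits)
    return 100.0 * best / n


def is_candidate_primer(primer: str, fragment: str,
                        min_length: int = 10,
                        min_identity: float = 70.0) -> bool:
    """Apply the length and identity thresholds quoted in the text."""
    return (len(primer) >= min_length
            and best_primer_identity(primer, fragment) >= min_identity)
```

Raising `min_length` to 15, 18 or 20 gives the progressively preferred variants; checking the complementary strand would require reverse-complementing the primer first, which this sketch omits.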
Generally speaking, the invention also
encompasses any RNA or DNA, and in particular replication
vector, comprising a genomic fragment of the viral
material such as is defined above, or a nucleotide
fragment such as is defined above.

The invention also relates to the different
peptides encoded by any open reading frame belonging to a
nucleotide fragment such as is defined above, in
particular any polypeptide, for example any oligopeptide
forming or comprising an antigenic determinant recognized
by sera of patients infected with the MSRV-1 virus and/or
in whom the MSRV-1 virus has been reactivated. Preferably,
this polypeptide is antigenic, and is encoded by the open
reading frame beginning, in the 5'-3' direction, at
nucleotide 181 and ending at nucleotide 330 of
SEQ ID NO:1.

The invention also encompasses the following
polypeptides:

a)

- a polypeptide encoded by any open reading frame
belonging to a nucleotide fragment, fragment I, as defined
above;

- a polypeptide, characterized in that the open reading
frame encoding it is comprised, in the 5'-3' direction,
between nucleotide 18 and nucleotide 2304 of SEQ ID NO:93;

- a polypeptide, having a peptide sequence comprising a
sequence partially or totally identical to SEQ ID NO:95;

b)

- a polypeptide, recombinant or synthetic, having a
peptide sequence which comprises a sequence identical or
equivalent to SEQ ID NO:96; in particular said polypeptide
exhibits an enzymatic activity consisting of proteolytic
activity;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 18 and ends at nucleotide
340 of SEQ ID NO:93;

- a polypeptide having an inhibitory activity on the
proteolytic activity of a polypeptide as defined according
to b);

c)

- a polypeptide, recombinant or synthetic, having a
peptide sequence which comprises a sequence identical or
equivalent to SEQ ID NO:97; in particular said polypeptide
exhibits a reverse transcriptase activity;

- a polypeptide having a peptide sequence which comprises
a sequence identical or equivalent to SEQ ID NO:98; in
particular said polypeptide exhibits a ribonuclease
activity;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 341 and ends at nucleotide
2304 of SEQ ID NO:93;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 1858 and ends at nucleotide
2304 of SEQ ID NO:93;

- a polypeptide having an inhibitory activity on the
reverse transcriptase activity of a polypeptide as defined
according to c) or on the ribonuclease H activity of a
polypeptide as defined according to c).
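
For illustration, the way a polypeptide is derived from an open reading frame delimited by nucleotide positions can be sketched as follows. The sketch assumes that positions are 1-based and inclusive and that the standard genetic code applies; neither assumption is stated in the description. It uses a toy sequence rather than SEQ ID NO:93, and the names `extract_region` and `translate` are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def extract_region(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive; an assumption)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def translate(nt: str) -> str:
    """Translate in frame 1; '*' marks a stop, 'X' an unrecognized codon.

    Trailing nucleotides that do not fill a codon are ignored.
    """
    nt = nt.upper().replace("U", "T")
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X")
                   for i in range(0, usable, 3))
```

With a real sequence, `translate(extract_region(seq, 18, 2304))` would give a frame-1 translation of the region named in a); whether that frame matches the one intended by the description would have to be checked against the corresponding peptide sequence listing.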

In particular, the invention relates to an
antigenic polypeptide recognized by the sera of patients
infected with the MSRV-1 virus, and/or in whom the MSRV-1
virus has been reactivated, whose peptide sequence is
partially or totally identical or is equivalent to the
sequence defined by SEQ ID NO:39, SEQ ID NO:63,
SEQ ID NO:87, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97,
SEQ ID NO:98, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175,
SEQ ID NO:180, SEQ ID NO:181 and SEQ ID NO:182; such a
sequence is identical, for example, to any sequence
selected from the group including the sequences
SEQ ID NO:41 to SEQ ID NO:44, SEQ ID NO:63 and
SEQ ID NO:87.

The present invention also proposes mono- or
polyclonal antibodies directed against the MSRV-1 virus,
which are obtained by the immunological reaction of a
human or animal body or cells to an immunogenic agent
consisting of an antigenic polypeptide such as is defined
above.

The invention next relates to:

- reagents for detection of the MSRV-1 virus, or of an
exposure to the latter, comprising at least one reactive
substance selected from the group consisting of a probe of
the present invention, a polypeptide, in particular an
antigenic peptide, such as is defined above, or an
anti-ligand, in particular an antibody to the said polypeptide;

- all diagnostic, prophylactic or therapeutic compositions
comprising one or more peptides, in particular antigenic
peptides, such as are defined above, or one or more
anti-ligands, in particular antibodies to the peptides,
discussed above; such a composition is preferably, and by
way of example, a vaccine composition.

The invention also relates to any diagnostic,
prophylactic or therapeutic composition, in particular for
inhibiting the expression of at least one virus associated
with MS or RA, and/or the enzymatic activities of the
proteins of said virus, comprising a nucleotide fragment
such as is defined above or a polynucleotide, in
particular oligonucleotide, whose sequence is partially
identical to that of the said fragment, except for that of
the fragment having the nucleotide sequence SEQ ID NO:1.
Likewise, it relates to any diagnostic, prophylactic or
therapeutic composition, in particular for inhibiting the
expression of at least one pathogenic and/or infective
agent associated with RA, comprising a nucleotide fragment
such as is defined above by reference to the pol and gag
genes, and especially with respect to the sequences
SEQ ID NOS 40, 51, 56, 59, 60, 61, 62 and 89.

According to the invention, these same fragments
or polynucleotides, in particular oligonucleotides, may
participate in all suitable compositions for detecting,
according to any suitable process or method, a
pathological and/or infective agent associated with MS and with
RA, respectively, in a biological sample. In such a
process, an RNA and/or a DNA presumed to belong to or
to originate from the said pathological and/or infective
agent, and/or their complementary RNA and/or DNA, is/are
brought into contact with such a composition.

The present invention also relates to any
process for detecting the presence of or exposure to such a
pathological and/or infective agent, in a biological
sample, by bringing this sample into contact with a
peptide, in particular an antigenic peptide such as is
defined above, or an anti-ligand, in particular an
antibody to this peptide, such as is defined above.

In practice, and for example, a device for
detection of the MSRV-1 virus comprises a reagent such as
is defined above, supported by a solid support which is
immunologically compatible with the reagent, and a means
for bringing the biological sample, for example a sample
of blood or of cerebrospinal fluid, likely to contain
anti-MSRV-1 antibodies, into contact with this reagent
under conditions permitting a possible immunological
reaction, the foregoing items being accompanied by means
for detecting the immune complex formed with this reagent.

Lastly, the invention also relates to the
detection of anti-MSRV-1 antibodies in a biological sample,
for example a sample of blood or of cerebrospinal fluid,
according to which this sample is brought into contact
with a reagent such as is defined above, consisting of an
antibody, under conditions permitting their possible
immunological reaction, and the presence of the immune
complex thereby formed with the reagent is then detected.

Before describing the invention in detail, 
different terms used in the description and the claims are 
now defined: 

- strain or isolate is understood to mean any
infective and/or pathogenic biological fraction
containing, for example, viruses and/or bacteria and/or
parasites, generating pathogenic and/or antigenic power,
harboured by a culture or a living host; as an example, a
viral strain according to the above definition can contain
a coinfective agent, for example a pathogenic protist,

- the term "MSRV" used in the present
description denotes any pathogenic and/or infective agent
associated with MS, in particular a viral species, the
attenuated strains of the said viral species or the
defective-interfering particles or particles containing
coencapsidated genomes, or alternatively genomes
recombined with a portion of the MSRV-1 genome, derived
from this species. Viruses, and especially viruses
containing RNA, are known to have a variability resulting,
in particular, from relatively high rates of spontaneous
mutation (7), which will be borne in mind below for
defining the notion of equivalence,

- human virus is understood to mean a virus
capable of infecting, or of being harboured by, human
beings,

- in view of all the natural or induced
variations and/or recombination which may be encountered
when implementing the present invention, the subjects of
the latter, defined above and in the claims, have been
expressed including the equivalents or derivatives of the
different biological materials defined below, in
particular of the homologous nucleotide or peptide
sequences,

- the variant of a virus or of a pathogenic
and/or infective agent according to the invention
comprises at least one antigen recognized by at least one
antibody directed against at least one corresponding
antigen of the said virus and/or said pathogenic and/or
infective agent, and/or a genome any part of which is
detected by at least one hybridization probe and/or at
least one nucleotide amplification primer specific for the
said virus and/or pathogenic and/or infective agent, such
as, for example, for the MSRV-1 virus, the primers and
probes having a nucleotide sequence chosen from
SEQ ID NO:20 to SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:16
to SEQ ID NO:19, SEQ ID NO:31 to SEQ ID NO:33,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49,
SEQ ID NO:50, SEQ ID NO:45 and their complementary
sequences, under particular hybridization conditions well
known to a person skilled in the art,

- according to the invention, a nucleotide
fragment or an oligonucleotide or polynucleotide is an
arrangement of monomers, or a biopolymer, characterized by
the informational sequence of the natural nucleic acids,
which is capable of hybridizing with any other nucleotide
fragment under predetermined conditions, it being possible
for the arrangement to contain monomers of different
chemical structures and to be obtained from a molecule of
natural nucleic acid and/or by genetic recombination
and/or by chemical synthesis; a nucleotide fragment may be
identical to a genomic fragment of the MSRV-1 virus
discussed in the present invention, in particular a gene
of this virus, for example pol or env in the case of the
said virus,

- thus, a monomer can be a natural nucleotide of
nucleic acid whose constituent elements are a sugar, a
phosphate group and a nitrogenous base; in RNA the sugar
is ribose, in DNA the sugar is 2-deoxyribose; depending on
whether the nucleic acid is DNA or RNA, the nitrogenous
base is chosen from adenine, guanine, uracil, cytosine and
thymine; or the nucleotide can be modified in at least one
of the three constituent elements; as an example, the
modification can occur in the bases, generating modified
bases such as inosine, 5-methyldeoxycytidine,
deoxyuridine, 5-(dimethylamino)deoxyuridine,
2,6-diaminopurine, 5-bromodeoxyuridine and any other
modified base promoting hybridization; in the sugar, the
modification can consist of the replacement of at least
one deoxyribose by a polyamide (8), and in the phosphate
group, the modification can consist of its replacement by
esters chosen, in particular, from diphosphate, alkyl- and
arylphosphonate and phosphorothioate esters,

- "informational sequence" is understood to mean 
any ordered succession of monomers whose chemical nature
and order in a reference direction constitute, or do not
constitute, an item of functional information of the same
quality as that of the natural nucleic acids,

- hybridization is understood to mean the
process during which, under suitable working conditions,
two nucleotide fragments having sufficiently complementary
sequences pair to form a complex structure, in particular
double or triple, preferably in the form of a helix,

- a probe comprises a nucleotide fragment
synthesized chemically or obtained by digestion or enzymatic
cleavage of a longer nucleotide fragment, comprising at
least six monomers, advantageously from 10 to 1000
monomers, preferably 10 to 30 monomers and more preferably
18 to 30, and possessing a specificity of hybridization
under particular conditions; preferably, a probe possessing
fewer than 10 monomers, but preferably fewer than 15
monomers, is not used alone, but is used in the presence of
other probes of equally short size or otherwise; under
certain special conditions, it may be useful to use probes
of size greater than 100 monomers; a probe may be used, in
particular, for diagnostic purposes, such molecules being,
for example, capture and/or detection probes,

- the capture probe may be immobilized on a
solid support by any suitable means, that is to say
directly or indirectly, for example by covalent bonding or
passive adsorption,

- the detection probe may be labelled by means
of a label chosen, in particular, from radioactive
isotopes, enzymes chosen, in particular, from peroxidase
and alkaline phosphatase and those capable of hydrolysing
a chromogenic, fluorogenic or luminescent substrate,
chromophoric chemical compounds, chromogenic, fluorogenic
or luminescent compounds, nucleotide base analogues and
biotin,

- the probes used for diagnostic purposes of the
invention may be employed in all known hybridization
techniques, and in particular the techniques termed
"DOT-BLOT" (9), "SOUTHERN BLOT" (10), "NORTHERN BLOT", which
is a technique identical to the "SOUTHERN BLOT" technique
but which uses RNA as target, and the SANDWICH technique
(11); advantageously, the SANDWICH technique is used in the
present invention, comprising a specific capture probe
and/or a specific detection probe, on the understanding
that the capture probe and the detection probe must
possess an at least partially different nucleotide
sequence,

- any probe according to the present invention
can hybridize in vivo or in vitro with RNA and/or with DNA
in order to block the phenomena of replication, in
particular translation and/or transcription, and/or to
degrade the said DNA and/or RNA,

- a primer is a probe comprising at least six
monomers, and advantageously from 10 to 30 monomers, and
preferably from 18 to 25 monomers, possessing a
specificity of hybridization under particular conditions
for the initiation of an enzymatic polymerization, for
example in an amplification technique such as PCR
(polymerase chain reaction), in an elongation process such
as sequencing, in a method of reverse transcription or the
like,

- two nucleotide or peptide sequences are termed
equivalent or derived with respect to one another, or with
respect to a reference sequence, if functionally the
corresponding biopolymers can perform substantially the
same role, without being identical, as regards the
application or use in question, or in the technique in
which they participate; two sequences are, in particular,
equivalent if they are obtained as a result of natural
variability, in particular spontaneous mutation of the
species from which they have been identified, or induced
variability, as are two homologous sequences, homology
being defined below,

- "variability" is understood to mean any 
spontaneous or induced modification of a sequence, in
particular by substitution and/or insertion and/or deletion
of nucleotides and/or of nucleotide fragments, and/or
extension and/or shortening of the sequence at one or both
ends; an unnatural variability can result from the genetic
engineering techniques used, for example the choice of
synthesis primers, degenerate or otherwise, selected for
amplifying a nucleic acid; this variability can manifest
itself in modifications of any starting sequence,
considered as reference, and capable of being expressed by
a degree of homology relative to the said reference
sequence,

- homology characterizes the degree of identity
of two nucleotide or peptide fragments compared; it is
measured by the percentage identity which is determined,
in particular, by direct comparison of nucleotide or
peptide sequences, relative to reference nucleotide or
peptide sequences,

- this percentage identity has been specifically
determined for the nucleotide fragments, clones in
particular, dealt with in the present invention, which are
homologous to the fragments identified, for the MSRV-1
virus, by SEQ ID NO:1 to NO:9, SEQ ID NO:46, SEQ ID NO:51
to SEQ ID NO:53, SEQ ID NO:40, SEQ ID NO:56, SEQ ID NO:57
and SEQ ID NO:93, as well as for the probes and primers
homologous to the probes and primers identified by SEQ ID
NO:20 to SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:16 to SEQ
ID NO:19, SEQ ID NO:31 to SEQ ID NO:33, SEQ ID NO:45, SEQ
ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID
NO:55, SEQ ID NO:40, SEQ ID NO:56, SEQ ID NO:57 and SEQ ID
NO:99 to SEQ ID NO:111; as an example, the smallest
percentage identity observed between the different general
consensus sequences of nucleic acids obtained from
fragments of MSRV-1 viral RNA, originating from the LM7PC
and PLI-2 lines according to a protocol detailed later, is
67% in the region described in Figure 1,

- any nucleotide fragment is termed equivalent
or derived from a reference fragment if it possesses a
nucleotide sequence equivalent to the sequence of the
reference fragment; according to the above definition, the
following in particular are equivalent to a reference
nucleotide fragment:

a) any fragment capable of hybridizing at least
partially with the complement of the reference fragment,

b) any fragment whose alignment with the
reference fragment results in the demonstration of a larger
number of identical contiguous bases than with any other
fragment originating from another taxonomic group,

c) any fragment resulting, or capable of
resulting, from the natural variability of the species from
which it is obtained,

d) any fragment capable of resulting from the
genetic engineering techniques applied to the reference
fragment,

e) any fragment containing at least eight
contiguous nucleotides encoding a peptide which is
homologous or identical to the peptide encoded by the
reference fragment,

f) any fragment which is different from the
reference fragment by insertion, deletion or substitution
of at least one monomer, or extension or shortening at one
or both of its ends; for example, any fragment
corresponding to the reference fragment flanked at one or
both of its ends by a nucleotide sequence not coding for a
polypeptide,

- polypeptide is understood to mean, in
particular, any peptide of at least two amino acids, in
particular an oligopeptide, or protein, and for example an
enzyme, extracted, separated or substantially isolated or
synthesized through human intervention, in particular
those obtained by chemical synthesis or by expression in a
recombinant organism,

- polypeptide partially encoded by a nucleotide
fragment is understood to mean a polypeptide possessing at
least three amino acids encoded by at least nine
contiguous monomers lying within the said nucleotide
fragment,

- an amino acid is termed analogous to another
amino acid when their respective physicochemical
properties, such as polarity, hydrophobicity and/or basicity
and/or acidity and/or neutrality are substantially the
same; thus, a leucine is analogous to an isoleucine,

- any polypeptide is termed equivalent or
derived from a reference polypeptide if the polypeptides
compared have substantially the same properties, and in
particular the same antigenic, immunological,
enzymological and/or molecular recognition properties; the
following in particular are equivalent to a reference
polypeptide:

a) any polypeptide possessing a sequence in
which at least one amino acid has been replaced by an
analogous amino acid,

b) any polypeptide having an equivalent peptide
sequence, obtained by natural or induced variation of the
said reference polypeptide and/or of the nucleotide
fragment coding for the said polypeptide,

c) a mimotope of the said reference polypeptide,

d) any polypeptide in whose sequence one or more
amino acids of the L series are replaced by an amino acid
of the D series, and vice versa,

e) any polypeptide into whose sequence a
modification of the side chains of the amino acids has been
introduced, such as, for example, an acetylation of the
amine functions, a carboxylation of the thiol functions,
an esterification of the carboxyl functions,

f) any polypeptide in whose sequence one or more
peptide bonds have been modified, such as, for example,
carba, retro, inverse, retro-inverso, reduced and
methylenoxy bonds,

g) any polypeptide at least one antigen of
which is recognized by an antibody directed against a
reference polypeptide,

- the percentage identity characterizing the
homology of two peptide fragments compared is, according
to the present invention, at least 50% and preferably at
least 70%.

In view of the fact that a virus possessing
reverse transcriptase enzymatic activity may be
genetically characterized equally well in RNA and in DNA
form, both the viral DNA and RNA will be referred to for
characterizing the sequences relating to a virus
possessing such reverse transcriptase activity, termed
MSRV-1 according to the present description.
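
The notion of percentage identity used in the definitions above can be illustrated by a minimal sketch. It assumes that the two sequences have already been aligned to equal length and that a gap is written "-"; the function names, the handling of gaps and the choice of denominator (all aligned columns) are illustrative assumptions and are not taken from the present description, which does not specify the calculation. Only the 50% and 70% thresholds for peptide fragments come from the text.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Assumption: the sequences are pre-aligned (equal length, gaps as `gap`).
    Columns where both sequences have a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable positions")
    # A gap facing a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def peptide_homology_tier(identity: float) -> str:
    """Classify a peptide percentage identity against the 50% / 70% thresholds."""
    if identity >= 70.0:
        return "preferred (>= 70%)"
    if identity >= 50.0:
        return "homologous (>= 50%)"
    return "below threshold"
```

For example, `percent_identity("ACGTACGTAC", "ACGTTCGTAA")` compares ten columns of which eight are identical. Other conventions for the denominator (for instance the length of the shorter sequence) give different figures, so any comparison of percentages should use a single convention throughout.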

The expressions of order used in the present
description and the claims, such as "first nucleotide
sequence", are not adopted so as to express a particular
order, but so as to define the invention more clearly.

Detection of a substance or agent is understood
below to mean both an identification and a quantification,
or a separation or isolation, of the said substance or
said agent.
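
The length ranges given earlier in the definitions of a probe and of a primer can be summarised as a small check. This is only a sketch: the tier labels, the table layout and the function name are illustrative assumptions, while the numerical bounds (at least six monomers; 10 to 1000, 10 to 30 and 18 to 30 for probes; 10 to 30 and 18 to 25 for primers) are those stated in the definitions.

```python
MIN_MONOMERS = 6  # lower bound stated for both probes and primers

# Tiers are listed from the narrowest range to the widest,
# so the first match is the most specific one.
PROBE_TIERS = [
    ("more preferred", 18, 30),
    ("preferred", 10, 30),
    ("advantageous", 10, 1000),
]
PRIMER_TIERS = [
    ("preferred", 18, 25),
    ("advantageous", 10, 30),
]


def length_tier(n_monomers: int, tiers) -> str:
    """Return the most specific length tier that contains n_monomers."""
    if n_monomers < MIN_MONOMERS:
        return "too short"
    for name, low, high in tiers:
        if low <= n_monomers <= high:
            return name
    # At least six monomers, but outside every named range.
    return "permitted"
```

Under these assumptions a 20-mer falls in the narrowest range for both uses, whereas a 28-mer is "more preferred" as a probe but only "advantageous" as a primer.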

A better understanding of the invention will be 
gained on reading the detailed description which follows, 
prepared with reference to the attached figures, in which: 

- Figure 1 shows general consensus sequences of
nucleic acids of the MSRV-1B clones amplified by the PCR
technique in the "pol" region defined by Shih (12), from
viral DNA originating from the LM7PC and PLI-2 lines, and
identified under the references SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5 and SEQ ID NO:6, and the common consensus with
amplification primers bearing the reference SEQ ID NO:7;

- Figure 2 gives the definition of a functional
reading frame for each MSRV-1B/"PCR pol" type family, the
said families A to D being defined, respectively, by the
nucleotide sequences SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5
and SEQ ID NO:6 described in Figure 1;

- Figure 3 gives an example of consensus of the
MSRV-2B sequences, identified by SEQ ID NO:11;

- Figure 4 is a representation of the reverse
transcriptase (RT) activity in dpm (disintegrations per
minute) in the sucrose fractions taken from a purification
gradient of the virions produced by the B lymphocytes in
culture from a patient suffering from MS;

- Figure 5 gives, under the same experimental
conditions as in Figure 4, the assay of the reverse
transcriptase activity in the culture of a B lymphocyte
line obtained from a control free from MS;

- Figure 6 shows the nucleotide sequence of the
clone PSJ17 (SEQ ID NO:9);

- Figure 7 shows the nucleotide sequence SEQ ID
NO:8 of the clone designated M003-P004;

- Figure 8 shows the nucleotide sequence SEQ ID
NO:2 of the clone F11-1; the portion located between the
two arrows in the region of the primer corresponds to a
variability imposed by the choice of primer which was used
for the cloning of F11-1; in this same figure, the
translation into amino acids is shown;

- Figure 9 shows the nucleotide sequence SEQ ID
NO:1 and a possible functional reading frame of SEQ ID
NO:1 in terms of amino acids; on this sequence, the
consensus sequences of the pol gene are underlined;

- Figures 10 and 11 give the results of a PCR,
in the form of a photograph under ultraviolet light of an
ethidium bromide-impregnated agarose gel, of the
amplification products obtained from the primers identified
by SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18 and SEQ ID NO:19;

- Figure 12 gives a representation in matrix
form of the homology between SEQ ID NO:1 of MSRV-1 and
that of an endogenous retrovirus designated HSERV9; this
homology of at least 65% is demonstrated by a continuous
line, the absence of a line meaning a homology of less
than 65%;



- Figure 13 shows the nucleotide sequence SEQ ID
NO:46 of the clone FBd3;

- Figure 14 shows the sequence homology between
the clone FBd3 and the HSERV-9 retrovirus;

- Figure 15 shows the nucleotide sequence SEQ ID
NO:51 of the clone t pol;

- Figures 16 and 17 show, respectively, the
nucleotide sequences SEQ ID NO:52 and SEQ ID NO:53 of the
clones JLBc1 and JLBc2;

- Figure 18 shows the sequence homology between
the clone JLBc1 and the clone FBd3;

- Figure 19 shows the sequence homology between
the clone JLBc2 and the clone FBd3;

- Figure 20 shows the sequence homology between
the clones JLBc1 and JLBc2;

- Figures 21 and 22 show the sequence homology
between the HSERV-9 retrovirus and the clones JLBc1 and
JLBc2, respectively;

- Figure 23 shows the nucleotide sequence SEQ ID
NO:56 of the clone GM3;

- Figure 24 shows the sequence homology between
the HSERV-9 retrovirus and the clone GM3;

- Figure 25 shows the localization of the
different clones studied, relative to the genome of the
known retrovirus ERV9;

- Figure 26 shows the position of the clones
F11-1, M003-P004, MSRV-1B and PSJ17 in the region
hereinafter designated MSRV-1 pol*;

- Figure 27, split into three successive Figures
27a-27c, shows a possible reading frame covering the whole
of the pol gene;

- Figure 28 shows, according to SEQ ID NO:40,
the nucleotide sequence coding for the peptide fragment
POL2B, having the amino acid sequence identified by SEQ ID
NO:39;

- Figure 29 shows the OD values (ELISA tests) at
492 nm obtained for 29 sera of MS patients and 32 sera of
healthy controls tested with an anti-IgG antibody;

- Figure 30 shows the OD values (ELISA tests) at
492 nm obtained for 36 sera of MS patients and 42 sera of
healthy controls tested with an anti-IgM antibody;

- Figures 31 to 33 show the results obtained
(relative intensity of the spots) for 43 overlapping
octapeptides covering the amino acid sequence 61-110,
according to the Spotscan technique, respectively with a
pool of MS sera, with a pool of control sera and with the
pool of MS sera after deduction of a background
corresponding to the maximum signal detected on at least
one octapeptide with the control serum (intensity = 1), on
the understanding that these sera were diluted to 1/50. The
bar at the far right-hand end represents a graphic scale
standard unrelated to the serological test;

- Figure 34 shows the SEQ ID NO:41 and SEQ ID
NO:42 of two polypeptides comprising immunodominant
regions, while SEQ ID NO:43 and 44 represent
immunoreactive polypeptides specific to MS;

- Figure 35 shows the nucleotide sequence SEQ ID
NO:59 of the clone LB19 and three potential reading frames
of SEQ ID NO:59 in terms of amino acids;

- Figure 36 shows the nucleotide sequence SEQ ID
NO:88 (GAG*) and a potential reading frame of SEQ ID NO:88
in terms of amino acids;

- Figure 37 shows the sequence homology between
the clone FBdl3 and the HSERV-9 retrovirus; according to
this representation, the continuous line means a
percentage homology greater than or equal to 70% and the
absence of a line means a smaller percentage homology;

- Figure 38 shows the nucleotide sequence SEQ ID
NO:61 of the clone FP6 and three potential reading frames
of SEQ ID NO:61 in terms of amino acids;

- Figure 39 shows the nucleotide sequence SEQ ID
NO:89 of the clone G+E+A and three potential reading
frames of SEQ ID NO:89 in terms of amino acids;

- Figure 40 shows a reading frame found in the
region E and coding for an MSRV-1 retroviral protease
identified by SEQ ID NO:90;

- Figure 41 shows the response of each serum of
patients suffering from MS, indicated by the symbol (+),
and of healthy patients, symbolised by (-), tested with an
anti-IgG antibody, expressed as net optical density at
492 nm;

- Figure 42 shows the response of each serum of
patients suffering from MS, indicated by the symbols (+)
and (QS), and of healthy patients (-), tested with an
anti-IgM antibody, expressed as net optical density at
492 nm;

- Figure 43 shows the RT-activity profile in
sucrose density gradients of pellets from B-cell line
supernatants; control B-cell line ■ was obtained from the
relative of a patient with mitochondriopathy; MS B-cell
line □ was obtained from a patient with definite MS;

- Figure 44 shows the nucleotide and amino acid
alignment of the conserved pol regions of viruses detected
in the study (cf Example 18) by the "Pan-retrovirus" PCR.
"Deletions" are represented by dashes and standard
single-letter abbreviations are used to designate amino
acids and nucleotides (i = inosine). The most highly
conserved VLPQG and YXDD regions are shown as separate
blocks in bold type at the end of each sequence. Amino
acids which are present in all or in all but one of the
sequences are underlined. PCR primers (modified from (12))
PAN-UO and PAN-UI are orientated 5' to 3' (sense) whereas
primer PAN-DI is 3' to 5' (antisense). Degeneracies are
shown above (PAN-UO & PAN-DI) or below (PAN-UI) the PCR
primer sequences. "X" denotes the nine base 5' extension
cttggatcc, "-I" denotes the nine base 5' extension
ctcaagctt. The capture and detector probes DpV1 and CpV1b
used in the ELOSA assay are shown below a representative
MSRV-cpol sequence. At three positions below the
translated MSRV-cpol sequence alternative amino acids
(representing "non-silent" nucleic acid variations) are
shown in italics: K and Y substitutions were only observed
in PLI-1 derived clones whereas R and W were encoded by a
significant proportion of the clones irrespective of
derivation. Note that DpV1 is peroxidase labelled and that
CpV1b may be biotinylated at the 5' end if streptavidin
coated plates are used. The name of each sequence is
indicated at the left of the figure.

HTLV1: Human Leukaemia Virus type 1; HIV1: Human
Immunodeficiency Virus type 1; MoMLV: Moloney-Murine
Leukaemia Virus; MPMV: Mason-Pfizer Monkey Virus; ERV9:
Endogenous Retrovirus 9; MSRV-cpol: Multiple sclerosis
associated Retrovirus conserved pol region.

- Figure 45 shows a phylogenetic tree which is
based on the conserved amino acid region encoded by the
pol gene of MSRV and of representative endogenous and
exogenous retroviruses and DNA viruses with reverse
transcriptase. It was generated by the U.P.G.M.A. tree
program of Geneworks® software.

HSRV: Human Spumaretrovirus. EIAV: Equine Infectious
Anaemia Virus. BLV: Bovine Leukaemia Virus. HIV1, HIV2:
Human Immunodeficiency Viruses type 1 and 2. HTLV1 and
HTLV2: Human Leukaemia Viruses type 1 and 2. F-MuLV:
Friend-Murine Leukaemia Virus. MoMLV: Moloney-Murine
Leukaemia Virus. BAEV: Baboon Endogenous Virus. GaLV:
Gibbon Ape Leukaemia Virus. HUMER41: Human Endogenous
Retroviral sequence, clone 41. IAP: Intracisternal A-type
Particle. MPMV: Mason-Pfizer Monkey Virus. HERVK10: Human
Endogenous Retrovirus K10. MMTV: Mouse Mammary Tumour
Virus. HSERV9 (ERV9 database sequence): Human sequence of
Endogenous Retrovirus 9. MSRV: Multiple Sclerosis
associated Retrovirus. SIV: Simian Immunodeficiency Virus.
RTLV-H: Reverse Transcriptase-Like Viral sequence H. SFV:
Simian Foamy Virus. VISNA: Visna retrovirus. SIV1: Simian
Immunodeficiency Virus type 1. SRV-2: Simian Retrovirus
type 2. SMRV-H: Squirrel Monkey Retrovirus H.

- Figure 46 shows the MSRV sequence in the
Protease and Reverse-Transcriptase regions of the pol
gene.

The amino acid translation is aligned under the
corresponding nucleotide sequence. The region
corresponding to the Protease ORF cloned in a recombinant
vector and expressed in E. coli is boxed. The regions
corresponding to the A and B fragments amplified on plasma
samples from MS patients are indicated by brackets. The
Reverse-Transcriptase (RT) and RNase H (RNH) region is
boxed with a dotted line. The highly conserved amino acids
and/or active sites of enzyme activities of both PRT and
RT (including RNH) are shown underlined.

- Figure 47A illustrates the specific detection
of MSRV-pol RNA sequence by RT-PCR in the sucrose density
fraction associated with RT-activity and in MS plasma;
Figure 47B shows the RT-activity profile on a sucrose
density gradient obtained with extracellular virion
pelleted from an MS choroid-plexus culture. The photograph
below shows an agarose gel loaded with PCR products
amplified from round 1 (ST1.1) RT-PCR products with the
ST1.2 primer set. From left to right: water control 1 from
RT-PCR step with ST1.1 set; water control 2 amplified from
water control 1 with ST1.2 nested primers; molecular
weight markers; fractions 1 to 10 corresponding to the
RT-activity profile shown above; plasma samples C1 and C2
from healthy blood donors; plasma samples MS1 and MS2 from
two MS patients.

- Figure 48 shows an example of a variant and/or
recombined sequence in the region of the pol gene defined
by homology with the overlapping regions described in
Figure 25, as GM3, MSRV-1 pol*, t pol and FBd3.

- Figure 49 shows the nucleotide (Figure 49A)
and amino acid (Figure 49B) alignments of the pol region
between clones 1, 5 and 8 of the same patient (Experiment
46-7).

- Figure 50 shows the nucleotide (Figure 50A)
and amino acid (Figure 50B) alignments of the pol region
between clones 41, 43 and 42 of the same patient
(Experiment 68-1).

- Figure 51 shows the nucleotide (Figure 51A)
and amino acid (Figure 51B) alignments of the pol region
between the consensus sequence (SEQ ID NO:176) of clones
1, 5 and 8 of the same patient (Experiment 46-7) and
SEQ ID NO:1, and between their corresponding peptide
sequences.

- Figure 52 shows the nucleotide (Figure 52A)
and amino acid (Figure 52B) alignments of the pol region
between the consensus sequence (SEQ ID NO:169) of clones
41, 43 and 42 of the same patient (Experiment 68-1) and
SEQ ID NO:1, and between their corresponding peptide
sequences.

- Figure 53 shows the nucleotide (Figure 53A)
and amino acid (Figure 53B) alignments of the pol region
between the consensus sequence (SEQ ID NO:176) of clones
1, 5 and 8 of the same patient (Experiment 46-7) and the
consensus sequence (SEQ ID NO:169) of clones 41, 43 and
42 of the same patient (Experiment 68-1).

Table 5 (at the end of the description) shows
the sequences obtained by RT-PCR with degenerate pol
primers on sucrose density gradient fractions containing
the peak of RT-activity or its negative control (cf
Example 18); and

Table 6 (at the end of the description) shows
the clinical data and results of MSRV-cpol detection by
"Pan-retro" PCR with specific ELOSA assay, on CSF from MS
and control patients (cf Example 18).
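
The matrix representations of homology mentioned for Figures 12 and 37, in which a continuous line marks regions at or above a percentage threshold, can be approximated by a windowed comparison. The sketch below is an illustration only: the window length, the function names and the exhaustive scan are assumptions, and it is not the procedure of the software actually used to draw the figures. The 65% default mirrors the threshold quoted for Figure 12.

```python
def window_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two windows of equal length."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def dot_matrix(seq_x: str, seq_y: str, window: int = 10, threshold: float = 65.0):
    """Return the set of (i, j) window start positions meeting the threshold.

    A run of hits along a diagonal (i + k, j + k) corresponds to the
    continuous line of the matrix representation; gaps in the run
    correspond to regions below the threshold.
    """
    hits = set()
    for i in range(len(seq_x) - window + 1):
        win_x = seq_x[i:i + window]
        for j in range(len(seq_y) - window + 1):
            if window_identity(win_x, seq_y[j:j + window]) >= threshold:
                hits.add((i, j))
    return hits
```

Comparing a sequence with itself yields an unbroken main diagonal; raising the threshold or shortening the window changes which regions appear as a line, which is why both parameters should be reported with any such plot.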

EXAMPLE 1: OBTAINING CLONES DESIGNATED MSRV-1B
AND MSRV-2B, DEFINING, RESPECTIVELY, A RETROVIRUS MSRV-1
AND A COINFECTIVE AGENT MSRV2, BY "NESTED" PCR
AMPLIFICATION OF THE CONSERVED POL REGIONS OF RETROVIRUSES
ON VIRION PREPARATIONS ORIGINATING FROM THE LM7PC AND
PLI-2 LINES

A PCR technique derived from the technique
published by Shih (12) was used. This technique enables
all traces of contaminant DNA to be removed by treating all
the components of the reaction medium with DNase. It
concomitantly makes it possible, by the use of different
but overlapping primers in two successive series of PCR
amplification cycles, to increase the chances of
amplifying a cDNA synthesized from an amount of RNA which is
small at the outset and further reduced in the sample by
the spurious action of the DNase on the RNA. In effect,
the DNase is used under conditions of activity in excess
which enable all traces of contaminant DNA to be removed
before inactivation of this enzyme remaining in the sample
by heating to 85 °C for 10 minutes. This variant of the PCR
technique described by Shih (12) was used on a cDNA
synthesized from the nucleic acids of fractions of
infective particles purified on a sucrose gradient
according to the technique described by H. Perron (13)
from the "POL-2" isolate (ECACC No. V92072202) produced by
the PLI-2 line (ECACC No. 92072201) on the one hand, and
from the MS7PG isolate (ECACC No. V93010816) produced by
the LM7PC line (ECACC No. 93010817) on the other hand.
These cultures were obtained according to the methods
which formed the subject of the patent applications
published under Nos WO 93/20188 and WO 93/20189.

After cloning the products amplified by this
technique with the TA Cloning Kit® and analysis of the
sequence using an Applied Biosystems model 373A Automatic
Sequencer, the sequences were analysed using the
Geneworks® software on the latest available version of the
Genebank® data bank.

The sequences cloned and sequenced from these
samples correspond, in particular, to two types of
sequence: a first type of sequence, to be found in the
majority of the clones (55% of the clones originating from
the POL-2 isolates of the PLI-2 culture, and 67% of the
clones originating from the MS7PG isolates of the LM7PC
cultures), which corresponds to a family of "pol"
sequences closely similar to, but different from, the
endogenous human retrovirus designated ERV-9 or HSERV-9,
and a second type of sequence which corresponds to
sequences very strongly homologous to a sequence
attributed to another infective and/or pathogenic agent
designated MSRV-2.

The first type of sequence, representing the
majority of the clones, consists of sequences whose
variability enables four subfamilies of sequences to be
defined. These subfamilies are sufficiently similar to one
another for it to be possible to consider them to be
quasi-species originating from the same retrovirus, as is
well known for the HIV-1 retrovirus (14), or to be the
outcome of interference with several endogenous proviruses
coregulated in the producing cells. These more or less
defective endogenous elements are sensitive to the same
regulatory signals possibly generated by a replicative
provirus, since they belong to the same family of
endogenous retroviruses (15). This new family of
endogenous retroviruses, or alternatively this new
retroviral species from which the generation of
quasi-species has been obtained in culture, and which
contains a consensus of the sequences described below, is
designated MSRV-1B.

Figure 1 presents the general consensus
sequences of the sequences of the different MSRV-1B clones
sequenced in this experiment, these sequences being
identified, respectively, by SEQ ID NO: 3, SEQ ID NO: 4, SEQ
ID NO: 5 and SEQ ID NO: 6. These sequences display a
homology with respect to nucleic acids ranging from 70% to
88% with the HSERV9 sequence referenced X57147 and M37638
in the Genebank® data base. Four "consensus" nucleic acid
sequences representative of different quasi-species of a
possibly exogenous retrovirus MSRV-1B, or of different
subfamilies of an endogenous retrovirus MSRV-1B, have been
defined. These representative consensus sequences are
presented in Figure 2, with the translation into amino
acids. A functional reading frame exists for each
subfamily of these MSRV-1B sequences, and it can be seen
that the functional open reading frame corresponds in each
instance to the amino acid sequence appearing on the
second line under the nucleic acid sequence. The general
consensus of the MSRV-1B sequence, identified by SEQ ID
NO: 7 and obtained by this PCR technique in the "pol"
region, is presented in Figure 1.

The second type of sequence representing the
majority of the clones sequenced is represented by the
sequence MSRV-2B presented in Figure 3 and identified by
SEQ ID NO: 11. The differences observed in the sequences
corresponding to the PCR primers are explained by the use
of degenerate primers in mixture form used under different
technical conditions.

The MSRV-2B sequence (SEQ ID NO: 11) is sufficiently
divergent from the retroviral sequences already
described in the data banks for it to be suggested that
the sequence region in question belongs to a new infective
agent, designated MSRV-2. This infective agent would, in
principle, on the basis of the analysis of the first
sequences obtained, be related to a retrovirus but, in
view of the technique used for obtaining this sequence, it
could also be a DNA virus whose genome codes for an enzyme
which incidentally possesses reverse transcriptase
activity, as is the case, for example, with the hepatitis
B virus, HBV (12). Furthermore, the random nature of the
degenerate primers used for this PCR amplification
technique may very well have permitted, as a result of
unforeseen sequence homologies or of conserved sites in
the gene for a related enzyme, the amplification of a
nucleic acid originating from a prokaryotic or eukaryotic
pathogenic and/or coinfective agent (protist).



EXAMPLE 2: OBTAINING CLONES DESIGNATED MSRV-1B
AND MSRV-2B, DEFINING A FAMILY MSRV-1 AND MSRV-2, BY
"NESTED" PCR AMPLIFICATION OF THE CONSERVED POL REGIONS OF
RETROVIRUSES ON PREPARATIONS OF B LYMPHOCYTES FROM A NEW
CASE OF MS

The same PCR technique, modified according to
the technique of Shih (12), was used to amplify and
sequence the RNA nucleic acid material present in a
purified fraction of virions at the peak of "LM7-like"
reverse transcriptase activity on a sucrose gradient
according to the technique described by H. Perron (13),
and according to the protocols mentioned in Example 1,
from a spontaneous lymphoblastoid line obtained by
self-immortalization in culture of B lymphocytes from an MS
patient who was seropositive for the Epstein-Barr virus
(EBV), after setting up the blood lymphoid cells in
culture in a suitable culture medium containing a suitable
concentration of cyclosporin A. A representation of the
reverse transcriptase activity in the sucrose fractions
taken from a purification gradient of the virions produced
by this line is presented in Figure 4. Similarly, the
culture supernatants of a B line obtained under the same
conditions from a control free from MS were treated under
the same conditions, and the assay of reverse
transcriptase activity in the sucrose gradient fractions
proved negative throughout (background), and is presented
in Figure 5. Fraction 3 of the gradient corresponding to
the MS B line and the same fraction without reverse
transcriptase activity of the non-MS control gradient were
analysed by the same RT-PCR technique as before, derived
from Shih (12), followed by the same steps of cloning and
sequencing as described in Example 1.

It is particularly noteworthy that the MSRV-1
and MSRV-2 type sequences are to be found only in the
material associated with a peak of "LM7-like" reverse
transcriptase activity originating from the MS B
lymphoblastoid line. These sequences were not to be found with
the material from the control (non-MS) B lymphoblastoid
line in 26 recombinant clones taken at random. Only
Mo-MuLV type contaminant sequences, originating from the
commercial reverse transcriptase used for the cDNA
synthesis step, and sequences without any particular
retroviral analogy were to be found in this control, as a
result of the "consensus" amplification of homologous
polymerase sequences which is produced by this PCR
technique. Furthermore, the absence of a concentrated
target which competes for the amplification reaction in
the control sample permits the amplification of dilute
contaminants. The difference in results is manifestly
highly significant (chi-squared, p<0.001).

EXAMPLE 3: OBTAINING A CLONE PSJ17, DEFINING A
RETROVIRUS MSRV-1, BY REACTION OF ENDOGENOUS REVERSE
TRANSCRIPTASE WITH A VIRION PREPARATION ORIGINATING FROM
THE PLI-2 LINE

This approach is directed towards obtaining
reverse-transcribed DNA sequences from the supposedly
retroviral RNA in the isolate using the reverse
transcriptase activity present in this same isolate. This
reverse transcriptase activity can theoretically function
only in the presence of a retroviral RNA linked to a
primer tRNA or hybridized with short strands of DNA
already reverse-transcribed in the retroviral particles
(16). Thus, the obtaining of specific retroviral sequences
in a material contaminated with cellular nucleic acids was
optimized according to these authors by means of the
specific enzymatic amplification of the portions of viral
RNAs with a viral reverse transcriptase activity. To this
end, the authors determined the particular physicochemical
conditions under which this enzymatic activity of reverse
transcription on RNAs contained in virions could be
effective in vitro. These conditions correspond to the
technical description of the protocols presented below
(endogenous RT reaction, purification, cloning and
sequencing).

The molecular approach consisted in using a
preparation of concentrated but unpurified virion obtained
from the culture supernatants of the PLI-2 line, prepared
according to the following method: the culture
supernatants are collected twice weekly, precentrifuged at
10,000 rpm for 30 minutes to remove cell debris and then
frozen at -80°C or used as they are for the following
steps. The fresh or thawed supernatants are centrifuged on
a cushion of 30% glycerol-PBS at 100,000 g (or 30,000 rpm
in a type 45 T LKB-HITACHI rotor) for 2 h at 4°C. After
removal of the supernatant, the sedimented pellet is taken
up in a small volume of PBS and constitutes the fraction
of concentrated but unpurified virion. This concentrated
but unpurified viral sample was used to perform a
so-called endogenous reverse transcription reaction, as
described below.
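
The centrifugation step above is quoted both as a force (100,000 g) and as a rotor speed (30,000 rpm). The two are linked by the standard relation RCF = 1.118 x 10^-5 x r x rpm^2, with r in centimetres. The sketch below is only an illustration of that relation; the rotor radius is not stated in the text, so it is back-calculated from the quoted figures rather than taken from any rotor specification, and the function names are hypothetical.

```python
# Standard relation between rotor speed and relative centrifugal force:
#   RCF (in g) = 1.118e-5 * radius_cm * rpm**2
# The radius is NOT given in the text; it is derived here from the
# quoted pair of values (100,000 g at 30,000 rpm).

RCF_CONSTANT = 1.118e-5  # g per (cm * rpm^2)


def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (in g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_for(rcf_g: float, rpm: float) -> float:
    """Radius (in cm) at which a given speed produces a given RCF."""
    return rcf_g / (RCF_CONSTANT * rpm ** 2)


if __name__ == "__main__":
    implied_radius = radius_for(100_000, 30_000)
    print(f"implied radius: {implied_radius:.2f} cm")
    print(f"check: {rcf(30_000, implied_radius):.0f} g")
```

The same two functions can be applied to any other speed and force pair quoted in the protocols to see which effective radius it implies.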

A volume of 200 µl of virion purified according
to the protocol described above, and containing a reverse
transcriptase activity of approximately 1-5 million dpm,
is thawed at 37°C until a liquid phase appears, and then
placed on ice. A 5-fold concentrated buffer was prepared
with the following components: 500 mM Tris-HCl pH 8.2;
75 mM NaCl; 25 mM MgCl2; 75 mM DTT and 0.10% NP 40.
100 µl of 5X buffer + 25 µl of a 100 mM solution of dATP
+ 25 µl of a 100 mM solution of dTTP + 25 µl of a 100 mM
solution of dGTP + 25 µl of a 100 mM solution of dCTP +
100 µl of sterile distilled water + 200 µl of the virion
suspension (RT activity of 5 million dpm) in PBS were
mixed and incubated at 42°C for 3 hours. After this
incubation, the reaction mixture is added directly to a
buffered phenol/chloroform/isoamyl alcohol mixture (Sigma
ref. P 3803); the aqueous phase is collected and one
volume of sterile distilled water is added to the organic
phase to re-extract the residual nucleic acid material.
The collected aqueous phases are combined, and the nucleic
acids contained are precipitated by adding 3M sodium
acetate pH 5.2 to 1/10 volume + 2 volumes of ethanol +
1 µl of glycogen (Boehringer-Mannheim ref. 901 393) and
placing the sample at -20°C for 4 h or overnight at +4°C.
The precipitate obtained after centrifugation is then
washed with 70% ethanol and resuspended in 60 µl of
distilled water.

The products of this reaction were then
purified, cloned and sequenced according to the protocol
which will now be described. Blunt-ended DNAs with
unpaired adenines at the ends were generated. A
"filling-in" reaction was first performed: 25 µl of the
previously purified DNA solution were mixed with 2 µl of a
2.5 mM solution containing, in equimolar amounts, dATP +
dGTP + dTTP + dCTP / 1 µl of T4 DNA polymerase
(Boehringer-Mannheim ref. 1004 786) / 5 µl of 10X
"incubation buffer for restriction enzyme"
(Boehringer-Mannheim ref. 1417 975) / 1 µl of a 1% bovine
serum albumin solution / 16 µl of sterile distilled water.
This mixture was incubated for 20 minutes at 11°C. 50 µl
of TE buffer and 1 µl of glycogen (Boehringer-Mannheim
ref. 901 393) were added thereto before extraction of the
nucleic acids with phenol/chloroform/isoamyl alcohol
(Sigma ref. P 3803) and precipitation with sodium acetate
as described above. The DNA precipitated after
centrifugation is resuspended in 10 µl of 10 mM Tris
buffer pH 7.5. 5 µl of this suspension were then mixed
with 20 µl of 5X Taq buffer, 20 µl of 5 mM dATP, 1 µl
(5U) of Taq DNA polymerase (Amplitaq™) and 54 µl of
sterile distilled water. This mixture is incubated for 2 h
at 75°C with a film of oil on the surface of the solution.
The DNA suspended in the aqueous solution drawn off under
the film of oil after incubation is precipitated as
described above and resuspended in 2 µl of sterile
distilled water.

The DNA obtained was inserted into a plasmid
using the TA Cloning™ kit. The 2 µl of DNA solution were
mixed with 5 µl of sterile distilled water, 1 µl of a
10-fold concentrated ligation buffer "10X LIGATION
BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/µl) and 1 µl of
"TA DNA LIGASE". This mixture was incubated overnight at
12°C. The following steps were carried out according to
the instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the colonies
of recombinant bacteria (white) were picked out in order
to be cultured and to permit extraction of the plasmids
incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to
the Sp6 promoter present on the cloning plasmid of the TA
Cloning™ kit. The reaction prior to sequencing was then
performed according to the method recommended for the use
of the sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out
with an Applied Biosystems "Automatic Sequencer, model
373 A" apparatus according to the manufacturer's
instructions.
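
The composition of the endogenous reverse transcription mixture described above can be checked with simple dilution arithmetic. The sketch below is an illustration only: it takes the stock concentrations and relative volumes quoted in the text, assumes nothing beyond C_final = C_stock x V_added / V_total, and ignores whatever the PBS of the virion suspension contributes. Volumes are entered as the bare numbers given in the text, since only their ratios matter; all names are hypothetical.

```python
# Dilution check for the endogenous RT reaction mixture.
# Assumption: simple additive volumes, C_final = C_stock * V_added / V_total.
# The salts contributed by the PBS of the virion suspension are ignored.

FIVE_X_BUFFER = {            # stock concentrations of the 5X buffer
    "Tris-HCl pH 8.2 (mM)": 500.0,
    "NaCl (mM)": 75.0,
    "MgCl2 (mM)": 25.0,
    "DTT (mM)": 75.0,
    "NP 40 (%)": 0.10,
}

DNTP_STOCK_MM = 100.0        # each dNTP is added from a 100 mM solution

VOLUMES = {                  # relative volumes, in the units of the text
    "5X buffer": 100.0,
    "dATP": 25.0,
    "dTTP": 25.0,
    "dGTP": 25.0,
    "dCTP": 25.0,
    "water": 100.0,
    "virion suspension": 200.0,
}


def total_volume(volumes: dict) -> float:
    """Sum of all component volumes."""
    return sum(volumes.values())


def final_concentration(stock: float, added: float, total: float) -> float:
    """Concentration of a component after dilution into the mixture."""
    return stock * added / total


def reaction_summary() -> dict:
    """Final concentration of every buffer component and of each dNTP."""
    total = total_volume(VOLUMES)
    summary = {
        name: final_concentration(stock, VOLUMES["5X buffer"], total)
        for name, stock in FIVE_X_BUFFER.items()
    }
    for dntp in ("dATP", "dTTP", "dGTP", "dCTP"):
        summary[f"{dntp} (mM)"] = final_concentration(
            DNTP_STOCK_MM, VOLUMES[dntp], total
        )
    return summary


if __name__ == "__main__":
    for name, value in reaction_summary().items():
        print(f"{name}: {value:g}")
```

Under these assumptions the 5X buffer ends up at 1X in the mixture, which is consistent with its designation as a 5-fold concentrate.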

Discriminating analysis on the computerized data
banks of the sequences cloned from the DNA fragments
present in the reaction mixture enabled a retroviral type
sequence to be revealed. The corresponding clone PSJ17 was
completely sequenced, and the sequence obtained, presented
in Figure 6 and identified by SEQ ID NO: 9, was analysed
using the "Geneworks®" software on the updated "Genebank™"
data banks. An identical sequence already described could
not be found by analysis of the data banks. Only a partial
homology with some known retroviral elements was to be
found. The most useful relative homology relates to an
endogenous retrovirus designated ERV-9, or HSERV-9,
according to the references (18).

EXAMPLE 4: PCR AMPLIFICATION OF THE NUCLEIC ACID
SEQUENCE CONTAINED BETWEEN THE 5' REGION DEFINED BY THE
CLONE "POL MSRV-1B" AND THE 3' REGION DEFINED BY THE CLONE
PSJ17

Five oligonucleotides, M001, M002-A, M003-BCD,
P004 and P005, were defined in order to amplify the RNA
originating from purified POL-2 virions. Control reactions
were performed so as to check for the presence of
contaminants (reaction with water). The amplification
consists of an RT-PCR step according to the protocol
described in Example 2, followed by a "nested" PCR
according to the PCR protocol described in the document
EP-A-0,569,272. In the first RT-PCR cycle, the primers
M001 and P004 or P005 are used. In the second PCR cycle,
the primers M002-A or M003-BCD and the primer P004 are
used. The primers are positioned as follows:
                  M002-A
                  M003-BCD
       M001                        P004     P005
POL-2  <------ pol MSRV-1B ------> <------ PSJ17 ------
RNA

Their composition is: 
primer M001: GGTCITICCICAIGG (SEQ ID NO: 20) 
primer M002-A: TTAGGGATAGCCCTCATCTCT (SEQ ID NO: 21) 
primer M003-BCD: TCAGGGATAGCCCCCATCTAT (SEQ ID NO: 22) 
primer P004: AACCCTTTGCCACTACATCAATTT (SEQ ID NO: 23)
primer P005: GCGTAAGGACTCCTAGAGCTATT (SEQ ID NO: 24) 

The "nested" amplification product obtained, and 
designated M003-P004, is presented in Figure 7, and 
corresponds to the sequence SEQ ID NO: 8. 
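
The five primers listed in this example can be compared with a few lines of code. The sketch below is an illustration only: it reports the length of each primer, counts the inosine (I) positions of the degenerate primer M001, and lists the positions at which the two alternative inner primers M002-A and M003-BCD differ. The sequences are copied from the list above; the function names are hypothetical.

```python
# Primer sequences as listed in Example 4 (I = inosine).
PRIMERS = {
    "M001": "GGTCITICCICAIGG",
    "M002-A": "TTAGGGATAGCCCTCATCTCT",
    "M003-BCD": "TCAGGGATAGCCCCCATCTAT",
    "P004": "AACCCTTTGCCACTACATCAATTT",
    "P005": "GCGTAAGGACTCCTAGAGCTATT",
}


def inosine_count(seq: str) -> int:
    """Number of inosine positions in a primer."""
    return seq.count("I")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the standard bases (inosine excluded)."""
    standard = [base for base in seq if base in "ACGT"]
    return sum(base in "GC" for base in standard) / len(standard)


def mismatch_positions(a: str, b: str) -> list:
    """1-based positions at which two equal-length primers differ."""
    if len(a) != len(b):
        raise ValueError("primers must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), inosine_count(seq), round(gc_fraction(seq), 2))
    print(mismatch_positions(PRIMERS["M002-A"], PRIMERS["M003-BCD"]))
```

The small number of differences between M002-A and M003-BCD reflects the fact that they target the same site in different subfamilies of the pol MSRV-1B consensus.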

EXAMPLE 5: AMPLIFICATION AND CLONING OF A
PORTION OF THE MSRV-1 RETROVIRAL GENOME USING A SEQUENCE
ALREADY IDENTIFIED, IN A SAMPLE OF VIRUS PURIFIED AT THE
PEAK OF REVERSE TRANSCRIPTASE ACTIVITY

A PCR technique derived from the technique
published by Frohman (19) was used. The technique derived
makes it possible, using a specific primer at the 3' end
of the genome to be amplified, to elongate the sequence
towards the 5' region of the genome to be analysed. This
technical variant is described in the documentation of the
firm "Clontech Laboratories Inc." (Palo Alto, California,
USA) supplied with its product "5'-AmpliFINDER™ RACE
Kit", which was used on a fraction of virion purified as
described above.

The specific 3' primers used in the kit protocol
for the synthesis of the cDNA and the PCR amplification
are, respectively, complementary to the following MSRV-1
sequences:

cDNA: TCATCCATGTACCGAAGG (SEQ ID NO: 25)
amplification: ATGGGGTTCCCAAGTTCCCT (SEQ ID NO: 26)

The products originating from the PCR were
obtained after purification on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit
(British Biotechnology). The 2 µl of DNA solution were
mixed with 5 µl of sterile distilled water, 1 µl of a
10-fold concentrated ligation buffer "10X LIGATION BUFFER",
2 µl of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA
LIGASE". This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the
colonies of recombinant bacteria (white) were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to
the Sp6 promoter present on the cloning plasmid of the TA
Cloning™ kit. The reaction prior to sequencing was then
performed according to the method recommended for the use
of the sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out
with an Applied Biosystems "Automatic Sequencer model
373 A" apparatus according to the manufacturer's
instructions.

This technique was applied first to two
fractions of virion purified as described below on sucrose
from the "POL-2" isolate produced by the PLI-2 line on the
one hand, and from the MS7PG isolate produced by the LM7PC
line on the other hand. The culture supernatants are
collected twice weekly, precentrifuged at 10,000 rpm for
30 minutes to remove cell debris and then frozen at -80°C
or used as they are for the following steps. The fresh or
thawed supernatants are centrifuged on a cushion of 30%
glycerol-PBS at 100,000 g (or 30,000 rpm in a type 45 T
LKB-HITACHI rotor) for 2 h at 4°C. After removal of the
supernatant, the sedimented pellet is taken up in a small
volume of PBS and constitutes the fraction of concentrated
but unpurified virions. The concentrated virus is then
applied to a sucrose gradient in sterile PBS buffer (15 to
50% weight/weight) and ultracentrifuged at 35,000 rpm
(100,000 g) for 12 h at +4°C in a swing-out rotor.
10 fractions are collected, and 20 µl are withdrawn from
each fraction after homogenization to assay the reverse
transcriptase activity therein according to the technique
described by H. Perron (3). The fractions containing the
peak of "LM7-like" RT activity are then diluted in sterile
PBS buffer and ultracentrifuged for one hour at 35,000 rpm
(100,000 g) to sediment the viral particles. The pellet of
purified virion thereby obtained is then taken up in a
small volume of a buffer which is appropriate for the
extraction of RNA. The cDNA synthesis reaction mentioned
above is carried out on this RNA extracted from purified
extracellular virion. PCR amplification according to the
technique mentioned above enabled the clone F1-11 to be
obtained, whose sequence, identified by SEQ ID NO: 2, is
presented in Figure 8.

This clone makes it possible to define, with the
different clones previously sequenced, a region of
considerable length (1.2 kb) representative of the "pol"
gene of the MSRV-1 retrovirus, as presented in Figure 9.
This sequence, designated SEQ ID NO: 1, is reconstituted
from different clones overlapping one another at their
ends, correcting the artefacts associated with the primers
and with the amplification or cloning techniques which
would artificially interrupt the reading frame of the
whole. This sequence will be identified below under the
designation "MSRV-1 pol* region". Its degree of homology
with the HSERV-9 sequence is shown in Figure 12.

In Figure 9, the potential reading frame with 
its translation into amino acids is presented below the 
nucleic acid sequence. 

EXAMPLE 6: DETECTION OF SPECIFIC MSRV-1 AND
MSRV-2 SEQUENCES IN DIFFERENT SAMPLES OF PLASMA
ORIGINATING FROM PATIENTS SUFFERING FROM MS OR FROM
CONTROLS

A PCR technique was used to detect the MSRV-1
and MSRV-2 genomes in plasmas obtained from blood samples
collected on EDTA from patients suffering from MS and from
non-MS controls.

Extraction of the RNAs from plasma was performed
according to the technique described by P. Chomczynski
(20), after adding one volume of buffer containing
guanidinium thiocyanate to 1 ml of plasma stored frozen at
-80°C after collection.

For MSRV-2, the PCR was performed under the same
conditions and with the following primers:

- 5' primer, identified by SEQ ID NO: 14
5' GTAGTTCGATGTAGAAAGCG 3';

- 3' primer, identified by SEQ ID NO: 15
5' GCATCCGGCAACTGCACG 3'.

However, similar results were also obtained with
the following PCR primers in two successive amplifications
by "nested" PCR on samples of nucleic acids not treated
with DNase.

The primers used for this first step of
40 cycles with a hybridization temperature of 48°C are the
following:

- 5' primer, identified by SEQ ID NO: 27
5' GCCGATATCACCCGCCATGG 3', corresponding to a
5' MSRV-2 PCR primer, for a first PCR on samples from
patients,

- 3' primer, identified by SEQ ID NO: 28
5' GCATCCGGCAACTGCACG 3', corresponding to a 3'
MSRV-2 PCR primer, for a first PCR on samples from
patients.

After this step, 10 µl of the amplification
product are taken and used to carry out a second,
so-called "nested" PCR amplification with primers located
within the region already amplified. This second step
takes place over 35 cycles, with a primer hybridization
("annealing") temperature of 50°C. The reaction volume is
100 µl.

The primers used for this second step are the
following:

- 5' primer, identified by SEQ ID NO: 29
5' CGCGATGCTGGTTGGAGAGC 3', corresponding to a
5' MSRV-2 PCR primer, for a nested PCR on samples from
patients,

- 3' primer, identified by SEQ ID NO: 30
5' TCTCCACTCCGAATATTCCG 3', corresponding to a
3' MSRV-2 PCR primer, for a nested PCR on samples from
patients.

For MSRV-1, the amplification was performed in
two steps. Furthermore, the nucleic acid sample is treated
beforehand with DNase, and a control PCR without RT (AMV
reverse transcriptase) is performed on the two
amplification steps so as to verify that the RT-PCR
amplification comes exclusively from the MSRV-1 RNA. In
the event of a positive control without RT, the initial
aliquot sample of RNA is again treated with DNase and
amplified again.

The protocol for treatment with DNase lacking
RNAse activity is as follows: the extracted RNA is
aliquoted in the presence of "RNAse inhibitor"
(Boehringer-Mannheim) in water treated with DEPC at a
final concentration of 1 µg in 10 µl; to these 10 µl, 1 µl
of "RNAse-free DNAse" (Boehringer-Mannheim) and 1.2 µl of
pH 5 buffer containing 0.1 M sodium acetate and 5 mM
MgSO4 are added; the mixture is incubated for 15 min at
20°C and brought to 95°C for 1.5 min in a "thermocycler".

The first MSRV-1 RT-PCR step is performed
according to a variant of the RNA amplification method as
described in Patent Application No. EP-A-0,569,272. In
particular, the cDNA synthesis step is performed at 42°C
for one hour; the PCR amplification takes place over
40 cycles, with a primer hybridization ("annealing")
temperature of 53°C. The reaction volume is 100 µl.

The primers used for this first step are the
following:

- 5' primer, identified by SEQ ID NO: 16
5' AGGAGTAAGGAAACCCAACGGAC 3';

- 3' primer, identified by SEQ ID NO: 17
5' TAAGAGTTGCACAAGTGCG 3'.

After this step, 10 µl of the amplification
product are taken and used to carry out a second,
so-called "nested" PCR amplification with primers located
within the region already amplified. This second step
takes place over 35 cycles, with a primer hybridization
("annealing") temperature of 53°C. The reaction volume is
100 µl.

The primers used for this second step are the
following:

- 5' primer, identified by SEQ ID NO: 18
5' TCAGGGATAGCCCCCATCTAT 3';

- 3' primer, identified by SEQ ID NO: 19
5' AACCCTTTGCCACTACATCAATTT 3'.
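
The two-step ("nested") amplification schemes described in this example can be summarised as data, which makes the parameters of the MSRV-1 and MSRV-2 schemes easy to compare. The sketch below is an illustration only: cycle numbers, annealing temperatures and SEQ ID numbers are those quoted in the text, whereas the class, field and function names are hypothetical. It also checks that the inner MSRV-1 primers (SEQ ID NO: 18 and 19) have the same sequences as the primers M003-BCD and P004 of Example 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrStep:
    """One amplification step, described by the primer pair and cycling used."""
    forward_seq_id: int
    reverse_seq_id: int
    cycles: int
    annealing_c: float
    with_reverse_transcription: bool = False


# Parameters as quoted in the text for each target.
NESTED_PROGRAMS = {
    "MSRV-2": (
        PcrStep(27, 28, cycles=40, annealing_c=48.0),
        PcrStep(29, 30, cycles=35, annealing_c=50.0),
    ),
    "MSRV-1": (
        # First step is an RT-PCR (cDNA synthesis at 42 degrees C for one hour).
        PcrStep(16, 17, cycles=40, annealing_c=53.0,
                with_reverse_transcription=True),
        PcrStep(18, 19, cycles=35, annealing_c=53.0),
    ),
}

INNER_MSRV1_PRIMERS = {
    18: "TCAGGGATAGCCCCCATCTAT",
    19: "AACCCTTTGCCACTACATCAATTT",
}

EXAMPLE_4_PRIMERS = {
    "M003-BCD": "TCAGGGATAGCCCCCATCTAT",
    "P004": "AACCCTTTGCCACTACATCAATTT",
}


def total_cycles(target: str) -> int:
    """Total number of PCR cycles over both steps for a given target."""
    return sum(step.cycles for step in NESTED_PROGRAMS[target])


def inner_primers_match_example_4() -> bool:
    """True if SEQ ID NO: 18 and 19 equal M003-BCD and P004 respectively."""
    return (
        INNER_MSRV1_PRIMERS[18] == EXAMPLE_4_PRIMERS["M003-BCD"]
        and INNER_MSRV1_PRIMERS[19] == EXAMPLE_4_PRIMERS["P004"]
    )


if __name__ == "__main__":
    for target in NESTED_PROGRAMS:
        print(target, total_cycles(target))
    print(inner_primers_match_example_4())
```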

Figures 10 and 11 present the results of PCR in
the form of photographs under ultraviolet light of
ethidium bromide-impregnated agarose gels, in which an
electrophoresis of the PCR amplification products applied
separately to the different wells was performed.

The top photograph (Figure 10) shows the result
of specific MSRV-2 amplification.

Well number 8 contains a mixture of DNA
molecular weight markers, and wells 1 to 7 represent, in
order, the products amplified from the total RNAs of
plasmas originating from 4 healthy controls free from MS
(wells 1 to 4) and from 3 patients suffering from MS at
different stages of the disease (wells 5 to 7).

In this series, MSRV-2 nucleic acid material is
detected in the plasma of one case of MS out of the 3
tested, and in none of the 4 control plasmas. Other
results obtained on more extensive series confirm these
results.

The bottom photograph (Figure 11) shows the
result of specific amplification by MSRV-1 "nested"
RT-PCR:

Well No. 1 contains the PCR product produced
with water alone, without the addition of AMV reverse
transcriptase; well No. 2 contains the PCR product
produced with water alone, with the addition of AMV
reverse transcriptase; well No. 3 contains a mixture of
DNA molecular weight markers; wells 4 to 13 contain, in
order, the products amplified from the total RNAs
extracted from sucrose gradient fractions (collected in a
downward direction), on which gradient a pellet of virion
originating from a supernatant of a culture infected with
MSRV-1 and MSRV-2 was centrifuged to equilibrium according
to the protocol described by H. Perron (13); to well 14
nothing was applied; to wells 15 to 17, the amplified
products of RNA extracted from plasmas originating from 3
different patients suffering from MS at different stages
of the disease were applied.

The MSRV-1 retroviral genome is indeed to be
found in the sucrose gradient fraction containing the peak
of reverse transcriptase activity measured according to
the technique described by H. Perron (3), with a very
strong intensity (fraction 5 of the gradient, placed in
well No. 8). A slight amplification has taken place in the
first fraction (well No. 4), probably corresponding to RNA
released by lysed particles which floated at the surface
of the gradient; similarly, aggregated debris has
sedimented in the last fraction (tube bottom), carrying
with it a few copies of the MSRV-1 genome which have given
rise to an amplification of low intensity.

Of the 3 MS plasmas tested in this series,
MSRV-1 RNA turned up in one case, producing a very intense
amplification (well No. 17).

In this series, the MSRV-1 retroviral RNA
genome, probably corresponding to particles of
extracellular virus present in the plasma in extremely
small numbers, was detected by "nested" RT-PCR in one case
of MS out of the 3 tested. Other results obtained on more
extensive series confirm these results.

Furthermore, the specificity of the sequences
amplified by these PCR techniques may be verified and
evaluated by the "ELOSA" technique as described by
F. Mallet (21) and in the document FR-A-2,663,040.

For MSRV-1, the products of the nested PCR
described above may be tested in two ELOSA systems
enabling a consensus A and a consensus B+C+D of MSRV-1 to
be detected separately, corresponding to the subfamilies
described in Example 1 and Figures 1 and 2. In effect, the
sequences closely resembling the consensus B+C+D are to be
found essentially in the RNA samples originating from
MSRV-1 virions purified from cultures or amplified in
extracellular biological fluids of MS patients, whereas
the sequences closely resembling the consensus A are
essentially to be found in normal human cellular DNA.

The ELOSA/MSRV-1 system for the capture and
specific hybridization of the PCR products of the
subfamily A uses a capture oligonucleotide cpV1A with an
amine bond at the 5' end and a biotinylated detection
oligonucleotide dpV1A having as their sequence,
respectively:

- cpV1A identified by SEQ ID NO: 31
5' GATCTAGGCCACTTCTCAGGTCCAGS 3', corresponding
to the ELOSA capture oligonucleotide for the products of
MSRV-1 nested PCR performed with the primers identified by
SEQ ID NO: 16 and SEQ ID NO: 17, optionally followed by
amplification with the primers identified by SEQ ID NO: 18
and SEQ ID NO: 19 on samples from patients;

- dpV1A identified by SEQ ID NO: 32
5' CATCTITTTGGICAGGCAITAGC 3', corresponding to
the ELOSA capture oligonucleotide for the subfamily A of
the products of MSRV-1 "nested" PCR performed with the
primers identified by SEQ ID NO: 16 and SEQ ID NO: 17,
optionally followed by amplification with the primers
identified by SEQ ID NO: 18 and SEQ ID NO: 19 on samples
from patients.

The ELOSA/MSRV-1 system for the capture and
specific hybridization of the PCR products of the
subfamily B+C+D uses the same biotinylated detection
oligonucleotide dpV1A and a capture oligonucleotide cpV1B
with an amine bond at the 5' end having as its sequence:

- cpV1B identified by SEQ ID NO: 33
5' CTTGAGCCAGTTCTCATACCTGGA 3', corresponding to
the ELOSA capture oligonucleotide for the subfamily B + C
+ D of the products of MSRV-1 "nested" PCR performed with
the primers identified by SEQ ID NO: 16 and SEQ ID NO: 17,
optionally followed by amplification with the primers
identified by SEQ ID NO: 18 and SEQ ID NO: 19 on samples
from patients.

This ELOSA detection system made it possible to
verify that none of the PCR products thus amplified from
DNase-treated plasmas of MS patients contained a sequence
of the subfamily A, and that all were positive with the
consensus of the subfamilies B, C and D.

For MSRV-2, a similar ELOSA technique was
evaluated on isolates originating from infected cell cultures,
using the following PCR amplification primers:

- 5' primer, identified by SEQ ID NO: 34

5' AGTGYTRCCMCARGGCGCTGAA 3', corresponding to a
5' MSRV-2 PCR primer, for PCR on samples from cultures,

- 3' primer, identified by SEQ ID NO: 35

5' GMGGCCAGCAGSAKGTCATCCA 3', corresponding to a
3' MSRV-2 PCR primer, for PCR on samples from cultures,

and the capture oligonucleotide with an amine
bond at the 5' end cpV2 and the biotinylated detection
oligonucleotide dpV2 having as their respective sequences:

- cpV2 identified by SEQ ID NO: 36

5' GGATGCCGCCTATAGCCTCTAC 3', corresponding to an
ELOSA capture oligonucleotide for the products of MSRV-2
PCR performed with the primers SEQ ID NO: 34 and SEQ ID
NO: 35, or optionally with the degenerate primers defined
by Shih (12).

- dpV2 identified by SEQ ID NO: 37

5' AAGCCTATCGCGTGCAGTTGCC 3', corresponding to
an ELOSA detection oligonucleotide for the products of
MSRV-2 PCR performed with the primers SEQ ID NO: 34 and SEQ
ID NO: 35, or optionally with the degenerate primers
defined by Shih (12).
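
The MSRV-2 primers SEQ ID NO: 34 and SEQ ID NO: 35 contain
degenerate positions (Y, R, M, S, K). As an illustration only, the
following Python sketch expands such a primer into the individual
sequences it represents, assuming the standard IUPAC ambiguity
codes. The function names are hypothetical and are not part of the
described method; inosine (I), as found in SEQ ID NO: 32, is not
covered by this table.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed reading of Y, R, M, S, K).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n

def expand(primer):
    """List every non-degenerate sequence covered by the primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]

primers = {
    "SEQ ID NO: 34": "AGTGYTRCCMCARGGCGCTGAA",
    "SEQ ID NO: 35": "GMGGCCAGCAGSAKGTCATCCA",
}
for name, seq in primers.items():
    print(name, "length", len(seq), "degeneracy", degeneracy(seq))
```

Under this reading, each primer is a small pool of closely related
22-mers rather than a single sequence.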

This PCR amplification system, with a pair of
primers different from those which were described previously
for amplification on the samples from patients, made
it possible to confirm the infection with MSRV-2 of in
vitro cultures and of samples of nucleic acids used for
the molecular biology studies.

All things considered, the first results of PCR
detection of the genome of pathogenic and/or infective
agents show that it is possible that free "virus" may
circulate in the blood stream of patients in an acute,
virulent phase, outside the nervous system. This is
compatible with the almost invariable presence of "gaps"
in the blood-brain barrier of patients in an active phase
of MS.



EXAMPLE 7: OBTAINING SEQUENCES OF THE "env" GENE
OF THE MSRV-1 RETROVIRAL GENOME

As has already been described in Example 5, a
PCR technique derived from the technique published by
Frohman (19) was used. The derived technique makes it
possible, using a specific primer at the 3' end of the
genome to be amplified, to elongate the sequence towards
the 5' region of the genome to be analysed. This technical
variant is described in the documentation of Clontech
Laboratories Inc. (Palo Alto, California, USA) supplied
with its product "5'-AmpliFINDER™ RACE Kit", which was
used on a fraction of virion purified as described above.

In order to carry out an amplification of the 3'
region of the MSRV-1 retroviral genome encompassing the
region of the "env" gene, a study was carried out to
determine a consensus sequence in the LTR regions of the
same type as those of the defective endogenous retrovirus
HSERV-9 (18, 24), with which the MSRV-1 retrovirus
displays partial homologies.

The same specific 3' primer was used in the kit
protocol for the synthesis of the cDNA and the PCR
amplification; its sequence is as follows:

GTGCTGATTGGTGTATTTACAATCC (SEQ ID NO: 45)

Synthesis of the complementary DNA (cDNA) and
unidirectional PCR amplification with the above primer
were carried out in one step according to the method
described in Patent EP-A-0,569,272.

The products originating from the PCR were
extracted after purification on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). The 2 µl of DNA solution were mixed with
5 µl of sterile distilled water, 1 µl of a 10-fold
concentrated ligation buffer "10X LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning® kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out in order to
be cultured and to permit extraction of the plasmids
incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to
the Sp6 promoter present on the cloning plasmid of the TA
Cloning™ kit. The reaction prior to sequencing was then
performed according to the method recommended for the use
of the sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out
with an Applied Biosystems "automatic sequencer, model
373 A" apparatus according to the manufacturer's
instructions.

This technical approach was applied to a sample
of virion concentrated as described below from a mixture
of culture supernatants produced by B lymphoblastoid lines
such as are described in Example 2, established from
lymphocytes of patients suffering from MS and possessing
reverse transcriptase activity which is detectable
according to the technique described by Perron et al. (3):
the culture supernatants are collected twice weekly,
precentrifuged at 10,000 rpm for 30 minutes to remove cell
debris and then frozen at -80°C or used as they are for
the following steps. The fresh or thawed supernatants are
centrifuged on a cushion of 30% glycerol-PBS at 100,000 g
for 2 h at 4°C. After removal of the supernatant, the
sedimented pellet constitutes the sample of concentrated
but unpurified virions. The pellet thereby obtained is
then taken up in a small volume of an appropriate buffer
for the extraction of RNA. The cDNA synthesis reaction
mentioned above is carried out on this RNA extracted from
concentrated extracellular virion.

RT-PCR amplification according to the technique
mentioned above enabled the clone FBd3 to be obtained,
whose sequence, identified by SEQ ID NO: 46, is presented
in Figure 13.

In Figure 14, the sequence homology between the
clone FBd3 and the HSERV-9 retrovirus is shown on the
matrix chart by a continuous line for any partial homology
greater than or equal to 65%. It can be seen that there
are homologies in the flanking regions of the clone (with
the pol gene at the 5' end and with the env gene and then
the LTR at the 3' end), but that the internal region is
totally divergent and does not display any homology, even
weak, with the "env" gene of HSERV-9. Furthermore, it is
apparent that the clone FBd3 contains a longer "env"
region than the one which is described for the defective
endogenous HSERV-9; it may thus be seen that the internal
divergent region constitutes an "insert" between the
regions of partial homology with the HSERV-9 defective
genes.
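
A matrix chart of the kind referred to for Figure 14 can be sketched
as a sliding-window identity comparison. The following Python
example is a minimal illustration only: the window length, the
function names and the toy sequences are assumptions, since the
software and parameters actually used for Figure 14 are not stated
here. Only the 65% threshold is taken from the text.

```python
def window_identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def dot_matrix(seq1, seq2, window=20, threshold=0.65):
    """Return (i, j) window start positions sharing >= threshold identity."""
    hits = []
    for i in range(len(seq1) - window + 1):
        w1 = seq1[i:i + window]
        for j in range(len(seq2) - window + 1):
            if window_identity(w1, seq2[j:j + window]) >= threshold:
                hits.append((i, j))
    return hits

# Toy sequences (hypothetical, NOT taken from FBd3 or HSERV-9):
# shared flanking regions around a totally divergent internal region.
left, right = "ACGTTGCAAGGCTA", "TTGACCGTAGGCAT"
seq_a = left + "AAAAAAAAAAAA" + right
seq_b = left + "CCCCCCCCCCCC" + right

hits = dot_matrix(seq_a, seq_b, window=10, threshold=0.65)
diagonal = sorted(i for i, j in hits if i == j)
print(diagonal)  # positions on the main diagonal that pass the threshold
```

In such a plot, the flanking homologies appear as diagonal segments,
and a divergent internal region appears as a break in the diagonal.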

EXAMPLE 8: AMPLIFICATION, CLONING AND SEQUENCING
OF THE REGION OF THE MSRV-1 RETROVIRAL GENOME LOCATED
BETWEEN THE CLONES PSJ17 AND FBd3

Four oligonucleotides, F1, B4, F6 and B1, were
defined for amplifying RNA originating from concentrated
virions of the strains POL2 and MS7PG. Control reactions
were performed so as to check for the presence of
contaminants (reaction with water). The amplification
consists of a first step of RT-PCR according to the
protocol described in Patent Application EP-A-0,569,272,
followed by a second step of PCR performed on 10 µl of
product of the first step with primers internal to the
first amplified region ("nested" PCR). In the first RT-PCR
cycle, the primers F1 and B4 are used. In the second PCR
cycle, the primers F6 and B1 are used. The
primers are positioned as follows:

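[Diagram: the primers F1, F6, B1 and B4, in that order from
left to right, positioned on the MSRV-1 RNA in the region
between the clone PSJ17 (5' pol MSRV-1) and the clone FBd3
(3' pol MSRV-1 / 5' env).]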

Their composition is:

primer F1: TGATGTGAACGGCATACTCACTG (SEQ ID NO: 47)
primer B4: CCCAGAGGTTAGGAACTCCCTTTC (SEQ ID NO: 48)
primer F6: GCTAAAGGAGACTTGTGGTTGTCAG (SEQ ID NO: 49)
primer B1: CAACATGGGCATTTCGGATTAG (SEQ ID NO: 50)
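
As a simple check on the four primers listed above, their length,
GC fraction and reverse complement can be computed as in the
following Python sketch. The helper names are hypothetical, and the
sketch makes no assumption about which strand each primer anneals
to; it only manipulates the sequences as written.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of a non-degenerate DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)

primers = {
    "F1": "TGATGTGAACGGCATACTCACTG",    # SEQ ID NO: 47
    "B4": "CCCAGAGGTTAGGAACTCCCTTTC",   # SEQ ID NO: 48
    "F6": "GCTAAAGGAGACTTGTGGTTGTCAG",  # SEQ ID NO: 49
    "B1": "CAACATGGGCATTTCGGATTAG",     # SEQ ID NO: 50
}
for name, seq in primers.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```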

The product of "nested" amplification obtained
and designated "t pol" is presented in Figure 15, and
corresponds to the sequence SEQ ID NO: 51.

EXAMPLE 9: OBTAINING NEW SEQUENCES, EXPRESSED AS
RNA IN CELLS IN CULTURE PRODUCING MSRV-1, AND COMPRISING
AN "env" REGION OF THE MSRV-1 RETROVIRAL GENOME

A library of cDNA was produced according to the
procedure described by the manufacturer of the "cDNA
synthesis module, cDNA rapid adaptator ligation module,
cDNA rapid cloning module and lambda gt10 in vitro
packaging module" kits (Amersham, ref. RPN1256Y/Z, RPN1712,
RPN1713, RPN1717, N334Z), from the messenger RNA extracted
from cells of a B lymphoblastoid line such as is described
in Example 2, established from the lymphocytes of a
patient suffering from MS and possessing reverse
transcriptase activity which is detectable according to
the technique described by Perron et al. (3).

Oligonucleotides were defined for amplifying the
cDNA cloned into the nucleic acid library between the 3'
region of the clone PSJ17 (pol) and the 5' (LTR) region of
the clone FBd3. Control reactions were performed so as to
check for the presence of contaminants (reaction with
water). PCR reactions performed on the nucleic acids
cloned into the library with different pairs of primers
enabled a series of clones linking pol sequences to the
MSRV-1 type env or LTR sequences to be amplified.
Two clones are representative of the sequences
obtained in the cellular cDNA library:

- the clone JLBc1, whose sequence SEQ ID NO: 52 is
presented in Figure 16;

- the clone JLBc2, whose sequence SEQ ID NO: 53 is
presented in Figure 17.

The sequences of the clones JLBc1 and JLBc2 are
homologous to that of the clone FBd3, as is apparent in
Figures 18 and 19. The homology between the clone JLBc1
and the clone JLBc2 is shown in Figure 20.

The homologies between the clones JLBc1 and
JLBc2 on the one hand and the HSERV-9 sequence on the other
hand are presented, respectively, in Figures 21 and 22.

It will be noted that the region of homology
between JLBc1, JLBc2 and FBd3 comprises, with a few
variations in the sequence and size of the "insert", the
additional sequence ("inserted") which is absent from the
HSERV-9 env sequence, as described in Example 8.

It will also be noted that the cloned "pol"
region is very homologous to HSERV-9, does not possess a
reading frame (bearing in mind the sequence errors induced
by the techniques used, including even the automatic
sequencer) and diverges from the MSRV-1 sequences obtained
from virions. In view of the fact that these sequences
were cloned from the RNA of cells expressing MSRV-1
particles, it is probable that they originate from
endogenous retroviral elements related to the ERV9 family;
this is all the more likely given that the pol and
env genes are present on the same RNA, which is clearly not
the MSRV-1 genomic RNA. Some of these ERV9 elements
possess functional LTRs which can be activated by
replicative viruses coding for homologous or heterologous
transactivators. Under these conditions, the relationship
between MSRV-1 and HSERV-9 makes probable the
transactivation of the defective (or otherwise) endogenous
ERV9 elements by homologous, or even identical, MSRV-1
transactivating proteins.

Such a phenomenon may induce a viral interference
between the expression of MSRV-1 and the related
endogenous elements. Such an interference generally leads
to a so-called "defective-interfering" expression, some
features of which were found in the MSRV-1-infected
cultures studied. Furthermore, such a phenomenon is liable
to give rise to the expression of polypeptides, or even
of endogenous retroviral proteins, which are not
necessarily tolerated by the immune system. Such a scheme
of aberrant expression of endogenous elements related to
MSRV-1 and induced by the latter is liable to multiply the
aberrant antigens, and hence to contribute to the
induction of autoimmune processes such as are observed in
MS.

It is, however, essential to note that the
clones JLBc1 and JLBc2 differ from the ERV9 or HSERV-9
sequence already described, in that they possess a longer
env region comprising an additional region totally
divergent from ERV9. Their kinship with the endogenous
ERV9 family may hence be defined, but they clearly
constitute novel elements never hitherto described. In
effect, interrogation of the data banks of nucleic acid
sequences available in version No. 15 (1995) of the
"Entrez" software (NCBI, NIH, Bethesda, USA) did not
enable a known homologous sequence in the env region of
these clones to be identified.

EXAMPLE 10: OBTAINING SEQUENCES LOCATED IN THE
5' pol AND 3' gag REGION OF THE MSRV-1 RETROVIRAL GENOME

As has already been described in Example 5, a
PCR technique derived from the technique published by
Frohman (19) was used. The derived technique makes it
possible, using a specific primer at the 3' end of the
genome to be amplified, to elongate the sequence towards
the 5' region of the genome to be analysed. This technical
variant is described in the documentation of the firm
Clontech Laboratories Inc. (Palo Alto, California, USA),
supplied with its product "5'-AmpliFINDER™ RACE Kit",
which was used on a fraction of virion purified as
described above.

In order to carry out an amplification of the 5' 
region of the MSRV-1 retroviral genome starting from the 
pol sequence already sequenced (clone Fll-1) and extending 
towards the gag gene, MSRV-1 specific primers were 
defined. 

The specific 3' primers used in the kit protocol
for the synthesis of the cDNA and the PCR amplification
are, respectively, complementary to the following MSRV-1
sequences:

cDNA: (SEQ ID NO: 54)
CCTGAGTTCTTGCACTAACCC

amplification: (SEQ ID NO: 55)
GTCCGTTGGGTTTCCTTACTCCT

The products originating from the PCR were
extracted after purification on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). The 2 µl of DNA solution were mixed with
5 µl of sterile distilled water, 1 µl of a 10-fold
concentrated ligation buffer "10X LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning® kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analysed on agarose gel. Plasmids
possessing an insert detected under UV light after
staining the gel with ethidium bromide were selected for
sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning™ kit. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"automatic sequencer, model 373 A" apparatus according to
the manufacturer's instructions.

This technical approach was applied to a sample
of virion concentrated as described below from a mixture
of culture supernatants produced by B lymphoblastoid lines
such as are described in Example 2, established from
lymphocytes of patients suffering from MS and possessing
reverse transcriptase activity which is detectable
according to the technique described by Perron et al. (3):
the culture supernatants are collected twice weekly,
precentrifuged at 10,000 rpm for 30 minutes to remove cell
debris and then frozen at -80°C or used as they are for
the following steps. The fresh or thawed supernatants are
centrifuged on a cushion of 30% glycerol-PBS at 100,000 g
for 2 h at 4°C. After removal of the supernatant, the
sedimented pellet constitutes the sample of concentrated
but unpurified virions. The pellet thereby obtained is
then taken up in a small volume of an appropriate buffer
for the extraction of RNA. The cDNA synthesis reaction
mentioned above is carried out on this RNA extracted from
concentrated extracellular virion.

RT-PCR amplification according to the technique
mentioned above enabled the clone GM3 to be obtained,
whose sequence, identified by SEQ ID NO: 56, is presented
in Figure 23.

In Figure 24, the sequence homology between the
clone GM3 and the HSERV-9 retrovirus is shown on the
matrix chart by a continuous line, for any partial
homology greater than or equal to 65%.

In summary, Figure 25 shows the localization of
the different clones studied above, relative to the known
ERV9 genome. In Figure 25, since the MSRV-1 env region is
longer than the reference ERV9 env gene, the additional
region is shown above the point of insertion according to
a "V", on the understanding that the inserted material
displays a sequence and size variability between the
clones shown (JLBc1, JLBc2, FBd3). Figure 26 shows the
position of different clones studied in the MSRV-1 pol
region.

By means of the clone GM3 described above, a
possible reading frame could be defined, covering the
whole of the pol gene, referenced according to SEQ ID
NO: 57, shown in the successive Figures 27a to 27c.

EXAMPLE 11: DETECTION OF ANTI-MSRV-1 SPECIFIC 

ANTIBODIES IN HUMAN SERUM 

Identification of the sequence of the pol gene
of the MSRV-1 retrovirus and of an open reading frame of
this gene enabled the amino acid sequence SEQ ID NO: 39 of
a region of the said gene, referenced SEQ ID NO: 40, to be
determined (see Figure 28).

Different synthetic peptides corresponding to
fragments of the protein sequence of MSRV-1 reverse
transcriptase encoded by the pol gene were tested for
their antigenic specificity with respect to sera of
patients suffering from MS and of healthy controls.

The peptides were synthesized chemically by
solid-phase synthesis according to the Merrifield
technique (Barany G. and Merrifield R.B., 1980, in The
Peptides, 2, 1-284, Gross E. and Meienhofer J., Eds.,
Academic Press, New York). The practical details are those
described below.

a) Peptide synthesis:

The peptides were synthesized on a
phenylacetamidomethyl (PAM)/polystyrene/divinylbenzene resin
(Applied Biosystems, Inc., Foster City, CA), using an
"Applied Biosystems 430A" automatic synthesizer. The amino
acids are coupled in the form of hydroxybenzotriazole
(HOBT) esters. The amino acids used are obtained from
Novabiochem (Läufelfingen, Switzerland) or Bachem
(Bubendorf, Switzerland).

The chemical synthesis was performed using a
double coupling protocol with N-methylpyrrolidone (NMP) as
solvent. The peptides were cleaved from the resin, and
the side-chain protective groups removed simultaneously, using
hydrofluoric acid (HF) in a suitable apparatus (type I
cleavage apparatus, Peptide Institute, Osaka, Japan).

For 1 g of peptidyl resin, 10 ml of HF, 1 ml of
anisole and 1 ml of dimethyl sulphide (DMS) are used. The
mixture is stirred for 45 minutes at -2°C. The HF is then
evaporated off under vacuum. After intensive washes with
ether, the peptide is eluted from the resin with 10%
acetic acid and then lyophilized.
The peptides are purified by preparative high
performance liquid chromatography on a VYDAC C18 type
column (250 x 21 mm) (The Separation Group, Hesperia, CA,
USA). Elution is carried out with an acetonitrile gradient
at a flow rate of 22 ml/min. The fractions collected are
monitored by an elution under isocratic conditions on a
VYDAC® C18 analytical column (250 x 4.6 mm) at a flow rate
of 1 ml/min. Fractions having the same retention time are
pooled and lyophilized. The preponderant fraction is then
analysed by analytical high performance liquid
chromatography with the system described above. The
peptide which is considered to be of acceptable purity
manifests itself in a single peak representing not less
than 95% of the chromatogram.

The purified peptides are then analysed with the
object of monitoring their amino acid composition, using
an Applied Biosystems 420H automatic amino acid analyser.
Measurement of the (average) chemical molecular mass of
the peptides is obtained using LSIMS mass spectrometry in
the positive ion mode on a VG ZAB.ZSEQ double focusing
instrument connected to a DEC-VAX 2000 acquisition system
(VG Analytical Ltd, Manchester, England).

The reactivity of the different peptides was
tested against sera of patients suffering from MS and
against sera of healthy controls. This enabled a peptide
designated POL2B to be selected, whose sequence is shown
in Figure 28 under the identifier SEQ ID NO: 39,
encoded by the pol gene of MSRV-1 (nucleotides 181 to
330).

b) Antigenic properties:

The antigenic properties of the POL2B peptide
were demonstrated according to the ELISA protocol
described below.

The lyophilized POL2B peptide was dissolved in
sterile distilled water at a concentration of 1 mg/ml.
This stock solution was aliquoted and kept at +4°C for use
over a fortnight, or frozen at -20°C for use within 2
months. An aliquot is diluted in PBS (phosphate buffered
saline) solution so as to obtain a final peptide
concentration of 1 microgram/ml. 100 microlitres of this
dilution are placed in each well of microtitration plates
("high-binding" plastic, COSTAR ref: 3590). The plates are
covered with a "plate-sealer" type adhesive and kept
overnight at +4°C for the phase of adsorption of the
peptide to the plastic. The adhesive is removed and the
plates are washed three times with a volume of 300
microlitres of a solution A (1X PBS, 0.05% Tween 20®), then
inverted over an absorbent tissue. The plates thus drained
are filled with 200 microlitres per well of a solution B
(solution A + 10% of goat serum), then covered with an
adhesive and incubated for 45 minutes to 1 hour at 37°C.
The plates are then washed three times with the solution A
as described above.

The test serum samples are diluted beforehand to
1/50 in the solution B, and 100 microlitres of each diluted
test serum are placed in the wells of each microtitration
plate. A negative control is placed in one well of each
plate, in the form of 100 microlitres of buffer B. The
plates covered with an adhesive are then incubated for 1
to 3 hours at 37°C. The plates are then washed three times
with the solution A as described above. In parallel, a
peroxidase-labelled goat antibody directed against human
IgG (Sigma Immunochemicals ref. A6029) or IgM (Cappel ref.
55228) is diluted in the solution B (dilution 1/5000 for
the anti-IgG and 1/1000 for the anti-IgM). 100 microlitres
of the appropriate dilution of the labelled antibody are
then placed in each well of the microtitration plates, and
the plates covered with an adhesive are incubated for 1 to
2 hours at 37°C. A further washing of the plates is then
performed as described above. In parallel, the peroxidase
substrate is prepared according to the directions of the
"Sigma fast OPD kit" (Sigma Immunochemicals, ref. P9187).
100 microlitres of substrate solution are placed in each
well, and the plates are placed protected from light for
20 to 30 minutes at room temperature.

When the colour reaction has stabilized, the
plates are placed immediately in an ELISA plate
spectrophotometric reader, and the optical density (OD) of
each well is read at a wavelength of 492 nm.
Alternatively, 30 microlitres of 1N HCl are placed in each well
to stop the reaction, and the plates are read in the
spectrophotometer within 24 hours.
The serological samples are introduced in duplicate
or in triplicate, and the optical density (OD)
corresponding to the serum tested is calculated by taking
the mean of the OD values obtained for the same sample at
the same dilution.

The net OD of each serum corresponds to the mean
OD of the serum minus the mean OD of the negative control
(solution B: PBS, 0.05% Tween 20®, 10% goat serum).
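
The net OD calculation just described can be expressed as the
following minimal Python sketch. The function name and the example
readings are hypothetical and serve only to illustrate the
arithmetic; they are not measured values.

```python
from statistics import mean

def net_od(sample_ods, negative_control_ods):
    """Mean OD of the replicate wells minus mean OD of the negative control."""
    return mean(sample_ods) - mean(negative_control_ods)

# Hypothetical duplicate readings at 492 nm for one serum,
# and one hypothetical negative-control well (solution B alone).
serum_wells = [0.71, 0.69]
control_wells = [0.08]
print(round(net_od(serum_wells, control_wells), 2))
```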

c) Detection of anti-MSRV-1 IgG antibodies by
ELISA:

The technique described above was used with the
POL2B peptide to test for the presence of anti-MSRV-1
specific IgG antibodies in the serum of 29 patients for
whom a definite or probable diagnosis of MS was
established according to the criteria of Poser (23), and of 32
healthy controls (blood donors).

Figure 29 shows the results for each serum
tested with an anti-IgG antibody. Each vertical bar
represents the net optical density (OD at 492 nm) of a
serum tested. The ordinate axis gives the net OD at the
top of the vertical bars. The first 29 vertical bars lying
to the left of the vertical broken line represent the sera
of 29 cases of MS tested, and the 32 vertical bars lying
to the right of the vertical broken line represent the
sera of 32 healthy controls (blood donors).

The mean of the net OD values for the MS sera
tested is 0.62. The diagram reveals 5 controls
whose net OD rises above the grouped values of
the control population. These values may represent the
presence of specific IgGs in symptomless seropositive
patients. Two methods were hence evaluated in order to
determine the statistical threshold of positivity of the
test.

The mean of the net OD values for the controls,
including the controls with high net OD values, is 0.36.
Without the 5 controls whose net OD values are greater
than or equal to 0.5, the mean of the "negative" controls
is 0.33. The standard deviation of the negative controls
is 0.10. A theoretical threshold of positivity may be
calculated according to the formula:

threshold value = (mean of the net OD values of the
seronegative controls) + (2 or 3 x standard deviation of
the net OD values of the seronegative controls).

In the first case, there are considered to be
symptomless seropositives, and the threshold value is
equal to 0.33 + (2 x 0.10) = 0.53. The negative results
represent a non-specific "background" of the presence of
antibodies directed specifically against an epitope of the
peptide.

In the second case, if the set of controls
consisting of blood donors in apparent good health is
taken as a reference basis, without excluding the sera
which are, on the face of it, seropositive, the standard
deviation of the "non-MS controls" is 0.116. The threshold
value then becomes 0.36 + (2 x 0.116) = 0.59.
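
The two threshold calculations above follow the same formula, mean
plus a multiple of the standard deviation. As a minimal Python
sketch (the function name is hypothetical; the numerical inputs are
the means and standard deviations quoted in the text):

```python
def positivity_threshold(mean_od, sd_od, k=2):
    """Threshold = mean net OD of the controls + k x standard deviation."""
    return mean_od + k * sd_od

# First case: the 5 high controls are excluded (mean 0.33, SD 0.10).
first = positivity_threshold(0.33, 0.10, k=2)
# Second case: all 32 controls are kept (mean 0.36, SD 0.116).
second = positivity_threshold(0.36, 0.116, k=2)
print(round(first, 2), round(second, 2))
```

The second value is 0.592 before rounding, which is quoted as 0.59.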

According to this analysis, the test is specific
for MS, since, as shown in Table 1, no control
has a net OD above this threshold. In fact, this result
reflects the fact that the antibody titres in patients
suffering from MS are, for the most part, higher than in
healthy controls who have been in contact with MSRV-1.

In accordance with the first method of
calculation, and as shown in Figure 29 and in the corresponding
Table 1, 26 of the 29 MS sera give a positive result (net
OD greater than or equal to 0.50), indicating the presence
of IgGs specifically directed against the POL2B peptide,
hence against a portion of the reverse transcriptase
enzyme of the MSRV-1 retrovirus encoded by its pol gene,
and consequently against the MSRV-1 retrovirus. Thus,
approximately 90% of the MS patients tested have reacted
against an epitope carried by the POL2B peptide and
possess circulating IgGs directed against the latter.

Five out of 32 blood donors in apparent good
health show a positive result. Thus, it is apparent that
approximately 15% of the symptomless population may have
been in contact with an epitope carried by the POL2B
peptide under conditions which have led to an active
immunization which manifests itself in the persistence of
specific serum IgGs. These conditions are compatible with
an immunization against the MSRV-1 retrovirus reverse
transcriptase during an infection with (and/or
reactivation of) the MSRV-1 retrovirus. The absence of apparent
neurological pathology recalling MS in these seropositive
controls may indicate that they are healthy carriers and
have eliminated an infectious virus after immunizing
themselves, or that they constitute an at-risk population
of chronic carriers. In effect, epidemiological data
showing that a pathogenic agent present in the environment
of regions of high prevalence of MS may be the cause of
this disease imply that a fraction of the population free
from MS has necessarily been in contact with such a
pathogenic agent. It has been shown that the MSRV-1
retrovirus constitutes all or part of this "pathogenic
agent" at the source of MS, and it is hence normal for
controls taken from a healthy population to possess IgG
type antibodies against components of the MSRV-1
retrovirus. Thus, the difference in seroprevalence between
the MS and control populations is extremely significant:
"chi-squared" test, p < 0.001. These results hence point
to an aetiopathogenic role of MSRV-1 in MS.
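
The significance level quoted above can be checked from the counts
given in the text (26 positive out of 29 MS sera, 5 positive out of
32 controls). The following Python sketch computes an uncorrected
Pearson chi-squared statistic for the 2 x 2 table; whether a
continuity correction was applied in the original analysis is not
stated, so this is an assumption, and the function names are
hypothetical.

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for
    the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_1df(chi2):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))

# Rows: MS sera, control sera. Columns: positive, negative.
stat = chi2_2x2(26, 3, 5, 27)
p = p_value_1df(stat)
print(round(stat, 2), p < 0.001)
```

With these counts, the statistic is well above the value required
for p < 0.001 at one degree of freedom.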

d) Detection of anti-MSRV-1 IgM antibodies by
ELISA:

The ELISA technique with the POL2B peptide was
used to test for the presence of anti-MSRV-1 IgM specific
antibodies in the serum of 36 patients for whom a definite
or probable diagnosis of MS was established according to
the criteria of Poser (23), and of 42 healthy controls
(blood donors).

Figure 30 shows the results for each serum tested with an
anti-IgM antibody. Each vertical bar represents the net optical
density (OD at 492 nm) of a serum tested. The ordinate axis
gives the net OD at the top of the vertical bars. The first 36
vertical bars lying to the left of the vertical broken line
cutting the abscissa axis represent the sera of 36 cases of MS
tested, and the vertical bars lying to the right of the vertical
broken line represent the sera of 42 healthy controls (blood
donors). The horizontal line drawn in the middle of the diagram
represents a theoretical threshold defining the boundary of the
positive results (in which the top of the bar lies above) and
the negative results (in which the top of the bar lies below).

The mean of the net OD values for the MS cases tested is 0.19.

The mean of the net OD values for the controls is 0.09.

The standard deviation of the negative controls is 0.05.

In view of the small difference between the mean and the
standard deviation of the controls, the threshold of theoretical
positivity may be calculated according to the formula:

threshold value = (mean of the net OD values of the
seronegative controls) + (3 x standard deviation of the net OD
values of the seronegative controls).

The threshold value is hence equal to 0.09 + (3 x 0.05) = 0.24;
or, in practice, 0.25.
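
The cutoff rule stated above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions,
not the procedure used by the inventors: the function names and
the three example OD readings are hypothetical; only the control
mean (0.09), the control standard deviation (0.05) and the
factor 3 are taken from the text.

```python
def positivity_threshold(control_mean, control_sd, k=3.0):
    """Mean of the seronegative controls plus k standard deviations."""
    return control_mean + k * control_sd


def classify(net_ods, threshold):
    """Label each net OD as positive (strictly above threshold) or negative."""
    return ["positive" if od > threshold else "negative" for od in net_ods]


# Values quoted in the text: control mean 0.09, control SD 0.05.
threshold = positivity_threshold(0.09, 0.05)

# Hypothetical net OD readings, for illustration only.
example_ods = [0.05, 0.19, 0.31]
labels = classify(example_ods, threshold)
```

With the quoted values the function returns 0.24 (up to
floating-point rounding), which sits just below the working
cutoff of 0.25 used in practice.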

The negative results represent a non-specific "background" of
the presence of antibodies directed specifically against an
epitope of the peptide.

According to this analysis, and as shown in Figure 30 and in the
corresponding Table 2, the IgM test is specific for MS, since no
control has a net OD above the threshold. 7 of the 36 MS sera
produce a positive IgM result; now, a study of the clinical data
reveals that these positive sera were taken during a first
attack of MS or an acute attack in untreated patients. It is
known that IgMs directed against pathogenic agents are produced
during primary infections or during reactivations following a
latency phase of the said pathogenic agent.

The difference in seroprevalence between the MS and control
populations is extremely significant: "chi-squared" test,
p < 0.001.

These results point to an aetiopathogenic role of MSRV-1 in MS.

The detection of IgM and IgG antibodies against the POL2B
peptide enables the course of an MSRV-1 infection and/or of the
viral reactivation of MSRV-1 to be evaluated.

e) Search for immunodominant epitopes in the POL2B peptide:

In order to reduce the non-specific background and to optimize
the detection of the responses of the anti-MSRV-1 antibodies,
the synthesis of octapeptides, advancing in successive one amino
acid steps, covering the whole of the sequence determined by
POL2B, was carried out according to the protocol described
below.

The chemical synthesis of overlapping octapeptides covering the
amino acid sequence 61-110 shown in the identifier SEQ ID NO: 39
was carried out on an activated cellulose membrane according to
the technique of BERG et al. (1989, J. Am. Chem. Soc., 111,
8024-8026) marketed by Cambridge Research Biochemicals under the
trade name Spotscan. This technique permits the simultaneous
synthesis of a large number of peptides and their analysis.
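
The one-residue stepping scheme described above can be sketched
as a sliding window. This is a minimal illustration, not the
Spotscan software: the function name is hypothetical, and the
13-residue test fragment is an assumption obtained by joining
the octapeptides FCIPVRPD and RPDSQFLF named further on, taken
here to overlap at their shared residues RPD.

```python
def overlapping_peptides(sequence, length=8, step=1):
    """Return every window of `length` residues, advancing `step` at a time."""
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


# Assumed fragment: FCIPVRPD and RPDSQFLF joined on the shared "RPD".
fragment = "FCIPVRPDSQFLF"
octapeptides = overlapping_peptides(fragment)

# A 50-residue stretch (positions 61-110) yields 50 - 8 + 1 windows.
n_spots = len(overlapping_peptides("A" * 50))
```

A stretch of N residues gives N - 7 octapeptides, so the 50
residues from position 61 to position 110 correspond to 43 spots
per membrane under this scheme.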

The synthesis is carried out with esterified amino acids in
which the α-amino group is protected with an FMOC group (Nova
Biochem) and the side-chain groups with protective groups such
as trityl, t-butyl ester or t-butyl ether. The esterified amino
acids are solubilized in N-methylpyrrolidone (NMP) at a
concentration of 300 nM, and 0.9 ml are applied to spots of
deposit of bromophenol blue. After incubation for 15 minutes, a
further application of amino acids is carried out, followed by
another 15-minute incubation. If the coupling between two amino
acids has taken place correctly, a coloration modification
(change from blue to yellow-green) is observed. After three
washes in DMF, an acetylation step is performed with acetic
anhydride. Next, the terminal amino groups of the peptides in
the process of synthesis are deprotected with 20% piperidine in
DMF. The spots of deposit are restained with a 1% solution of
bromophenol blue in DMF, washed three times with methanol and
dried. This set of operations constitutes one cycle of addition
of an amino acid, and this cycle is repeated until the synthesis
is complete. When all the amino acids have been added, the
NH2-terminal group of the last amino acid is deprotected with
20% piperidine in DMF and acetylated with acetic anhydride. The
groups protecting the side chain are removed with a
dichloromethane/trifluoroacetic acid/triisobutylsilane
(5 ml/5 ml/250 ml) mixture. The immunoreactivity of the peptides
is then tested by ELISA.

After synthesis of the different octapeptides in duplicate on
two different membranes, the latter are rinsed with methanol and
washed in TBS (0.1M Tris pH 7.2), then incubated overnight at
room temperature in a saturation buffer. After several washes in
TBS-T (0.1M Tris pH 7.2 - 0.05% Tween 20), one membrane is
incubated with a 1/50 dilution of a reference serum originating
from a patient suffering from MS, and the other membrane with a
1/50 dilution of a pool of sera of healthy controls. The
membranes are incubated for 4 hours at room temperature. After
washes with TBS-T, a β-galactosidase-labelled anti-human
immunoglobulin conjugate (marketed by Cambridge Research
Biochemicals) is added at a dilution of 1/200, and the mixture
is incubated for two hours at room temperature. After washes of
the membranes with 0.05% TBS-T and PBS, the immunoreactivity in
the different spots is visualized by adding
5-bromo-4-chloro-3-indolyl β-D-galactopyranoside in potassium.
The intensity of coloration of the spots is estimated
qualitatively with a relative value from 0 to 5 as shown in the
attached Figures 31 to 33.

In this way, it is possible to determine two immunodominant
regions at each end of the POL2B peptide, corresponding,
respectively, to the amino acid sequences 65-75 (SEQ ID NO: 41)
and 92-109 (SEQ ID NO: 42), according to Figure 34, and lying,
respectively, between the octapeptides
Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp (FCIPVRPD) and
Arg-Pro-Asp-Ser-Gln-Phe-Leu-Phe (RPDSQFLF), and
Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg (TVLPQGFR) and
Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln (LFGQALAQ), and a region which
is less reactive but apparently more specific, since it does not
produce any background with the control serum, represented by
the octapeptides Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu (LFAFEDPL)
(SEQ ID NO: 43) and Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn (FAFEDPLN)
(SEQ ID NO: 44).

These regions make it possible to define new peptides which are
more specific and more immunoreactive according to the usual
techniques.

It is thus possible, as a result of the discoveries made and the
methods developed by the inventors, to carry out a diagnosis of
MSRV-1 infection and/or reactivation and to evaluate a therapy
in MS on the basis of its efficacy in "negativing" the detection
of these agents in the patients' biological fluids. Furthermore,
early detection in individuals not yet displaying neurological
signs of MS could make it possible to institute a treatment
which would be all the more effective with respect to the
subsequent clinical course for the fact that it would precede
the lesion stage which corresponds to the onset of neurological
disorders. Now, at the present time, a diagnosis of MS cannot be
established before a symptomatology of neurological lesions has
set in, and hence no treatment is instituted before the
emergence of a clinical picture suggestive of lesions of the
central nervous system which are already significant. The
diagnosis of an MSRV-1 and/or MSRV-2 infection and/or
reactivation in man is hence of decisive importance, and the
present invention provides the means of doing this.

It is thus possible, apart from carrying out a diagnosis of
MSRV-1 infection and/or reactivation, to evaluate a therapy in
MS on the basis of its efficacy in "negativing" the detection of
these agents in the patients' biological fluids.

EXAMPLE 12: OBTAINING A CLONE LB19 CONTAINING A
PORTION OF THE gag GENE OF THE MSRV-1 RETROVIRUS

A PCR technique derived from the technique 
published by Gonzalez-Quintial R et al. (19) and PLAZA et 
al. (25) was used. From the total RNAs extracted from a 
fraction of virion purified as described above, the cDNA 
was synthesized using a specific primer (SEQ ID No. 64) at 
the 3' end of the genome to be amplified, using EXPAND™ 
REVERSE TRANSCRIPTASE (BOEHRINGER MANNHEIM) . 

cDNA: 

AAGGGGCATG GACGAGGTGG TGGCTTATTT (SEQ ID NO: 65) 
(antisense) 

After purification, a poly (G) tail was added at 
the 5' end of the cDNA using the "Terminal transferases 
kit" marketed by the company Boehringer Mannheim, 
according to the manufacturer's protocol. 

An anchoring PCR was carried out using the following 5' and 3'
primers:

AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC (SEQ ID No. 91)
(sense), and
AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 64) (antisense)

Next, a semi-nested anchoring PCR was carried out with the
following 5' and 3' primers:

AGATCTGCAG AATTCGATAT CA (SEQ ID No. 92) (sense), and
AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 64) (antisense)
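
The relationship between the two sense primers listed above can
be checked with a short sketch. It is offered only as an
illustration: the function name is hypothetical, and the reading
that the poly(C) stretch of SEQ ID No. 91 pairs with the poly(G)
tail added to the cDNA is an interpretation of the anchoring
strategy rather than a statement made in the text.

```python
def strip_homopolymer_tail(primer, base):
    """Remove a trailing run of `base` from the 3' end of a primer."""
    return primer.rstrip(base)


# Sense primers copied from the text (blocks of ten joined together).
ANCHOR_PRIMER = "AGATCTGCAG" "AATTCGATAT" "CACCCCCCCC" "CCCCCC"  # SEQ ID No. 91
SEMI_NESTED_PRIMER = "AGATCTGCAG" "AATTCGATAT" "CA"              # SEQ ID No. 92

core = strip_homopolymer_tail(ANCHOR_PRIMER, "C")
tail_length = len(ANCHOR_PRIMER) - len(core)
same_core = core == SEMI_NESTED_PRIMER
```

The semi-nested sense primer is thus the anchor primer without
its 3' homopolymer tail, while the antisense primer
(SEQ ID No. 64) is unchanged between the two rounds.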

The products originating from the PCR were purified on agarose
gel according to conventional methods (17), and then resuspended
in 10 microlitres of distilled water. Since one of the
properties of Taq polymerase consists in adding an adenine at
the 3' end of each of the two DNA strands, the DNA obtained was
inserted directly into a plasmid using the TA Cloning™ kit
(British Biotechnology). The 2 µl of DNA solution were mixed
with 5 µl of sterile distilled water, 1 µl of 10-fold
concentrated ligation buffer "10x LIGATION BUFFER", 2 µl of
"pCR™ VECTOR" (25 ng/ml) and 1 µl of "T4 DNA LIGASE". This
mixture was incubated overnight at 12°C. The following steps
were carried out according to the instructions of the TA
Cloning™ kit (British Biotechnology). At the end of the
procedure, the white colonies of recombinant bacteria were
picked out in order to be cultured and to permit extraction of
the plasmids incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each recombinant
colony was cut with a suitable restriction enzyme and analysed
on agarose gel. Plasmids possessing an insert detected under UV
light after staining the gel with ethidium bromide were selected
for sequencing of the insert, after hybridization with a primer
complementary to the Sp6 promoter present on the cloning plasmid
of the TA Cloning Kit™. The reaction prior to sequencing was
then performed according to the method recommended for the use
of the sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out with an
Applied Biosystems "Automatic Sequencer, model 373 A" apparatus
according to the manufacturer's instructions.
PCR amplification according to the technique mentioned above was
used on a cDNA synthesized from the nucleic acids of fractions
of infective particles purified on a sucrose gradient, according
to the technique described by H. Perron (13), from culture
supernatants of B lymphocytes of a patient suffering from MS,
immortalized with Epstein-Barr virus (EBV) strain B95 and
expressing retroviral particles associated with reverse
transcriptase activity as described by Perron et al. (3) and in
French Patent Applications MS 10, 11 and 12. This made it
possible to obtain the clone LB19, whose sequence, identified by
SEQ ID NO: 59, is presented in Figure 35.

The clone makes it possible to define, with the clone GM3
previously sequenced and the clone G+E+A (see Example 15), a
region of 690 base pairs representative of a significant portion
of the gag gene of the MSRV-1 retrovirus, as presented in
Figure 36. This sequence, designated SEQ ID NO: 88, is
reconstituted from different clones overlapping at their ends.
This sequence is identified under the name MSRV-1 "gag*" region.
In Figure 36, a potential reading frame with the translation
into amino acids is presented below the nucleic acid sequence.

EXAMPLE 13: OBTAINING A CLONE FBd13 CONTAINING A
pol GENE REGION RELATED TO THE MSRV-1 RETROVIRUS AND AN
APPARENTLY INCOMPLETE ENV REGION CONTAINING A POTENTIAL
READING FRAME (ORF) FOR A GLYCOPROTEIN

Extraction of viral RNAs: The RNAs were extracted according to
the method briefly described below.

A pool of culture supernatant of B lymphocytes of patients
suffering from MS (650 ml) is centrifuged for 30 minutes at
10,000 g. The viral pellet obtained is resuspended in
300 microlitres of PBS/10 mM MgCl2. The material is treated with
a DNAse (100 mg/ml)/RNAse (50 mg/ml) mixture for 30 minutes at
37°C and then with proteinase K (50 mg/ml) for 30 minutes at
46°C.

The nucleic acids are extracted with one volume of a
phenol/0.1% SDS (V/V) mixture heated to 60°C, and then
re-extracted with one volume of phenol/chloroform (1:1; V/V).

Precipitation of the material is performed with 2.5 V of ethanol
in the presence of 0.1 V of sodium acetate pH 5.2. The pellet
obtained after centrifugation is resuspended in 50 microlitres
of sterile DEPC water.

The sample is treated again with 50 mg/ml of "RNAse free" DNAse
for 30 minutes at room temperature, extracted with one volume of
phenol/chloroform and precipitated in the presence of sodium
acetate and ethanol.

The RNA obtained is quantified by an OD reading at 260 nm. The
presence of MSRV-1 and the absence of DNA contaminant is
monitored by a PCR and an MSRV-1-specific RT-PCR associated with
a specific ELOSA for the MSRV-1 genome.

Synthesis of cDNA:

5 mg of RNA are used to synthesize a cDNA primed with a poly(dT)
oligonucleotide according to the instructions of the "cDNA
Synthesis Module" kit (ref RPN 1256, Amersham) with a few
modifications: the reverse transcription is performed at 45°C
instead of the recommended 42°C.

The synthesis product is purified by a double extraction and a
double purification according to the manufacturer's
instructions.

The presence of MSRV-1 is verified by an MSRV-1 PCR associated
with a specific ELOSA for the MSRV-1 genome.

"Long Distance PCR": (LD-PCR) 

500 ng of cDNA are used for the LD-PCR step (Expand Long
Template System; Boehringer (ref. 1681 842)).

Several pairs of oligonucleotides were used. Among these, the
pair defined by the following primers:
5' primer: GGAGAAGAGC AGCATAAGTG G (SEQ ID NO: 66)
3' primer: GTGCTGATTG GTGTATTTAC AATCC (SEQ ID NO: 67).

The amplification conditions are as follows:
94°C 10 seconds
56°C 30 seconds
68°C 5 minutes;
10 cycles, then 20 cycles with an increment of 20 seconds in
each cycle on the elongation time. At the end of this first
amplification, 2 microlitres of the amplification product are
subjected to a second amplification under the same conditions as
before.

The LD-PCR reactions are conducted in a Perkin model 9600 PCR
apparatus in thin-walled microtubes (Boehringer).
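
The elongation programme described above (a fixed block of
cycles followed by a block with a per-cycle increment) can be
laid out as a schedule. This is a sketch under explicit
assumptions: the increment is taken to be 20 seconds, which is a
reading of a damaged figure in the text, the increment is
assumed to apply from the first of the 20 later cycles, and the
function name is hypothetical.

```python
def elongation_schedule(base_s=300, fixed_cycles=10,
                        ramp_cycles=20, increment_s=20):
    """Elongation time in seconds for each cycle of the programme."""
    schedule = [base_s] * fixed_cycles
    # Assumption: the k-th ramp cycle adds k increments to the base time.
    schedule += [base_s + increment_s * k for k in range(1, ramp_cycles + 1)]
    return schedule


schedule = elongation_schedule()
total_elongation_min = sum(schedule) / 60
```

Under these assumptions the elongation step grows from 5 minutes
to 11 minutes 40 seconds by the last cycle; a different
increment can be passed as `increment_s` if the damaged figure is
read otherwise.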

The amplification products are monitored by electrophoresis of
1/5th of the amplification volume (10 microlitres) in 1% agarose
gel. For the pair of primers described above, a band of
approximately 1.7 Kb is obtained.

Cloning of the amplified fragment:

The PCR product was purified by passage through a preparative
agarose gel and then through a Costar column (Spin; D. Dutcher)
according to the supplier's instructions.

2 microlitres of the purified solution are joined up with 50 ng
of vector PCRII according to the supplier's instructions (TA
Cloning Kit; British Biotechnology).

The recombinant vector obtained is isolated by transformation of
competent DH5αF' bacteria. The bacteria are selected using their
resistance to ampicillin and the loss of metabolism for Xgal
(= white colonies). The molecular structure of the recombinant
vector is confirmed by plasmid minipreparation and hydrolysis
with the enzyme EcoRI.

FBd13, a positive clone for all these criteria, was selected. A
large-scale preparation of the recombinant plasmid was performed
using the Midiprep Qiagen kit (ref 12243) according to the
supplier's instructions.

Sequencing of the clone FBd13 is performed by means of the
Perkin Prism Ready Amplitaq FS dye terminator kit (ref. 402119)
according to the manufacturer's instructions. The sequence
reactions are introduced into a Perkin type 377 or 373 A
automatic sequencer. The sequencing strategy consists in gene
walking carried out on both strands of the clone FBd13.

The sequence of the clone FBd13 is identified by SEQ ID NO: 58.

In Figure 37, the sequence homology between the clone FBd13 and
the HSERV-9 retrovirus is shown on the matrix chart by a
continuous line for any homology greater than or equal to 70%.
It can be seen that there are homologies in the flanking regions
of the clone (with the pol gene at the 5' end and with the env
gene and then the LTR at the 3' end), but that the internal
region is totally divergent and does not display any homology,
even weak, with the env gene of HSERV-9. Furthermore, it is
apparent that the clone FBd13 contains a longer env region than
the one which is described for the defective exogenous HSERV-9;
it may thus be seen that the internal divergent region
constitutes an "insert" between the regions of partial homology
with the HSERV-9 defective genes.

This additional sequence determines a potential ORF, designated
ORF B13, which is represented by its amino acid sequence
SEQ ID NO: 87.

The molecular structure of the clone FBd13 was analysed using
the GeneWork software and Genebank and SwissProt data banks.

5 glycosylation sites were found.

The protein does not have significant homology with already
known sequences.

It is probable that this clone originates from a recombination
of an endogenous retroviral element (ERV), linked to the
replication of MSRV-1.

Such a phenomenon does not fail to generate the expression of
polypeptides, or even of endogenous retroviral proteins which
are not necessarily tolerated by the immune system. Such a
scheme of aberrant expression of endogenous elements related to
MSRV-1 and/or induced by the latter is liable to multiply the
aberrant antigens, and hence tends to contribute to the
induction of autoimmune processes such as are observed in MS.
This clearly constitutes a novel element never hitherto
described. In effect, interrogation of the data banks of nucleic
acid sequences available in version No. 19 of the "Entrez"
software (NCBI, NIH, Bethesda, USA) did not enable a known
homologous sequence comprising the whole of the env region of
this clone to be identified.

EXAMPLE 14: OBTAINING A CLONE FP6 CONTAINING A
PORTION OF THE pol GENE, WITH A REGION CODING FOR THE
REVERSE TRANSCRIPTASE ENZYME HOMOLOGOUS TO THE CLONE POL*
OF MSRV-1, AND A 3'pol REGION DIVERGENT FROM THE EQUIVALENT
SEQUENCES DESCRIBED IN THE CLONES POL*, tpol, FBd3, JLBC1
and JLBC2

The amplification was performed on the RNA extracted from plasma
of a patient suffering from MS. A healthy control plasma treated
under the same conditions was used as negative control. The
synthesis of cDNA was carried out with the following modified
oligo(dT) primer:

5' GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 3' (SEQ ID NO: 68)

and Boehringer "Expand RT" reverse transcriptase according to
the conditions recommended by the company. A PCR was performed
with the enzyme Klentaq (Clontech) under the following
conditions: 94°C 5 min then 93°C 1 min, 58°C 1 min, 68°C 3 min
for 40 cycles and 68°C for 8 min, and with a final reaction
volume of 50 µl.
Primers used for the PCR:
- 5' primer, identified by SEQ ID NO: 69
5' GCCATCAAGC CACCCAAGAA CTCTTAACTT 3';
- 3' primer, identified by SEQ ID NO: 68 (= the same as for the
cDNA)

A second, so-called "semi-nested" PCR was carried out with a
5' primer located within the region already amplified. This
second PCR was performed under the same experimental conditions
as those used in the first PCR, using 10 µl of the amplification
product originating from the first PCR.

Primers used for the semi-nested PCR:
- 5' primer, identified by SEQ ID NO: 70
5' CCAATAGCCA GACCATTATA TACACTAATT 3';
- 3' primer, identified by SEQ ID NO: 68 (= the same as for the
cDNA)

Primers SEQ ID NO: 69 and SEQ ID NO: 70 are specific for the
pol* region: position No. 403 to No. 422 and No. 641 to No. 670,
respectively.

An amplification product was thus obtained from the
extracellular RNA extracted from the plasma of a patient
suffering from MS. The corresponding fragment was not observed
for the plasma of the healthy control. This amplification
product was cloned in the following manner.

The amplified DNA was inserted into a plasmid using the TA
Cloning™ kit. The 2 µl of DNA solution were mixed with 5 µl of
sterile distilled water, 1 µl of a 10-fold concentrated ligation
buffer "10x LIGATION BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/ml)
and 1 µl of "T4 DNA LIGASE". This mixture was incubated
overnight at 12°C. The following steps were carried out
according to the instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white colonies
of recombinant bacteria were picked out in order to be cultured
and to permit extraction of the plasmids incorporated according
to the so-called "miniprep" procedure (17). The plasmid
preparation from each recombinant colony was cut with a suitable
restriction enzyme and analyzed on agarose gel. Plasmids
possessing an insert detected under UV light after staining the
gel with ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to the
Sp6 promoter present on the cloning plasmid of the TA Cloning
Kit™. The reaction prior to sequencing was then performed
according to the method recommended for the use of the
sequencing kit "Prism ready reaction kit dye deoxyterminator
cycle sequencing kit" (Applied Biosystems, ref. 401384), and
automatic sequencing was carried out with an Applied Biosystems
"Automatic Sequencer, model 373 A" apparatus according to the
manufacturer's instructions.

The clone obtained, designated FP6, enables a region of 467 bp
which is 89% homologous to the pol* region of the MSRV-1
retrovirus and a region of [...]67 bp which is 64% homologous to
the pol region of ERV-9 (No. 1634 to 2856) to be defined.

The clone FP6 is represented in Figure 38 by its nucleotide
sequence identified by SEQ ID NO: 61. The three potential
reading frames of this clone are indicated by their amino acid
sequence under the nucleotide sequence.

EXAMPLE 15: OBTAINING A REGION DESIGNATED G+E+A
CONTAINING AN ORF FOR A RETROVIRAL PROTEASE, BY PCR
AMPLIFICATION OF THE NUCLEIC ACID SEQUENCE CONTAINED
BETWEEN THE 5' REGION DEFINED BY THE CLONE GM3 AND THE
3' REGION DEFINED BY THE CLONE POL*, FROM THE RNA
EXTRACTED FROM A POOL OF PLASMAS OF PATIENTS SUFFERING
FROM MS

Oligonucleotides specific for the MSRV-1 sequences already
identified by the Applicant were defined in order to amplify the
retroviral RNA originating from virions present in the plasma of
patients suffering from MS. Control reactions were performed so
as to monitor the presence of contaminants (reaction with
water). The amplification consists of a step of RT-PCR followed
by a "nested" PCR. Pairs of primers were defined for amplifying
three overlapping regions (designated G, E and A) on the regions
defined by the sequences of the clones GM3 and pol* described
above.

Semi-nested RT-PCR for amplification of the region G:

- in the first RT-PCR cycle, the following primers are used:

primer 1: SEQ ID NO: 71 (sense)
primer 2: SEQ ID NO: 72 (antisense)

- in the second PCR cycle, the following primers are used:

primer 1: SEQ ID NO: 73 (sense)
primer 4: SEQ ID NO: 74 (antisense)
Nested RT-PCR for amplification of the region E:

- in the first RT-PCR cycle, the following primers are used:

primer 5: SEQ ID NO: 75 (sense)
primer 6: SEQ ID NO: 76 (antisense)

- in the second PCR cycle, the following primers are used:

primer 7: SEQ ID NO: 77 (sense)
primer 8: SEQ ID NO: 78 (antisense)

Semi-nested RT-PCR for amplification of the region A:

- in the first RT-PCR cycle, the following primers are used:

primer 9: SEQ ID NO: 79 (sense)
primer 10: SEQ ID NO: 80 (antisense)

- in the second PCR cycle, the following primers are used:

primer 9: SEQ ID NO: 81 (sense)
primer 11: SEQ ID NO: 82 (antisense)

The primers and the regions G, E and A which they define are
positioned as follows:

cDNA
1       G       4    2
5   7   E   8   6
9       A   11  10

The sequence of the region defined by the different clones G, E
and A was determined after cloning and sequencing of the
"nested" amplification products.

The clones G, E and A were assembled together by PCR with the
primers 1 at the 5' end of the fragment G and 11 at the 3' end
of the fragment A, the primers being described above. An
approximately 1580-bp fragment G+E+A was amplified and inserted
into a plasmid using the TA Cloning (trademark) kit. The
sequence of the amplification product corresponding to G+E+A was
determined and analysis of the G+E and E+A overlaps was carried
out. The sequence is shown in Figure 39, and corresponds to the
sequence SEQ ID NO: 89.
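
The analysis of overlaps between adjacent fragments mentioned
above can be illustrated in silico. This is a simplified sketch
under stated assumptions, not the method used by the inventors
(who joined the fragments by PCR): the function name is
hypothetical, the three toy sequences are invented for the
example and are not the real G, E and A sequences, and only
exact suffix/prefix matches are considered.

```python
def merge_overlapping(left, right, min_overlap=4):
    """Join two sequences on the longest suffix of `left` that is a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no overlap of at least %d bases found" % min_overlap)


# Hypothetical toy fragments standing in for G, E and A.
g = "ATGGCCTTAGGC"
e = "TAGGCAAGTCCGA"
a = "TCCGATTGACA"

g_e = merge_overlapping(g, e)        # joins on the shared "TAGGC"
g_e_a = merge_overlapping(g_e, a)    # joins on the shared "TCCGA"
```

Real sequence data would normally call for an alignment that
tolerates mismatches; the exact-match rule is used here only to
keep the example short.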

A reading frame coding for an MSRV-1 retroviral protease was
found in the region E. The amino acid sequence of the protease,
identified by SEQ ID NO: 90, is presented in Figure 40.

EXAMPLE 16: OBTAINING A CLONE LTRGAG12, RELATED
TO AN ENDOGENOUS RETROVIRAL ELEMENT (ERV) CLOSE TO MSRV-1,
IN THE DNA OF AN MS LYMPHOBLASTOID LINE PRODUCING VIRIONS
AND EXPRESSING THE MSRV-1 RETROVIRUS

A nested PCR was performed on the DNA extracted from a
lymphoblastoid line (B lymphocytes immortalized with the EBV
virus strain B95, as described above and as is well known to a
person skilled in the art) expressing the MSRV-1 retrovirus and
originating from peripheral blood lymphocytes of a patient
suffering from MS.

In the first PCR step, the following primers are used:

primer 4327: CTCGATTTCT TGCTGGGCCT TA (SEQ ID NO: 83)
primer 3512: GTTGATTCCC TCCTCAAGCA (SEQ ID NO: 84)

This step comprises 35 amplification cycles with the following
conditions: 1 min at 94°C, 1 min at 54°C and 4 min at 72°C.

In the second PCR step, the following primers are used:

primer 4294: CTCTACCAAT CAGCATGTGG (SEQ ID NO: 85)
primer 3591: TGTTCCTCTT GGTCCCTAT (SEQ ID NO: 86)

This step comprises 35 amplification cycles with the following
conditions: 1 min at 94°C, 1 min at 54°C and 4 min at 72°C.

The products originating from the PCR were purified on agarose
gel according to conventional methods (17), and then resuspended
in 10 µl of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of each
of the two DNA strands, the DNA obtained was inserted directly
into a plasmid using the TA Cloning™ kit (British
Biotechnology). The 2 µl of DNA solution were mixed with 5 µl of
sterile distilled water, 1 µl of a 10-fold concentrated ligation
buffer "10X LIGATION BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/ml)
and 1 µl of "T4 DNA LIGASE". This mixture was incubated
overnight at 12°C. The following steps were carried out
according to the instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white colonies
of recombinant bacteria were picked out in order to be cultured
and to permit extraction of the plasmids incorporated according
to the so-called "miniprep" procedure (17). The plasmid
preparation from each recombinant colony was cut with a suitable
restriction enzyme and analyzed on agarose gel. The plasmids
possessing an insert detected under UV light after staining the
gel with ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to the
Sp6 promoter present on the cloning plasmid of the TA Cloning
Kit™. The reaction prior to sequencing was then performed
according to the method recommended for the use of the
sequencing kit "Prism ready reaction kit dye deoxyterminator
cycle sequencing kit" (Applied Biosystems, ref. 401384), and
automatic sequencing was carried out with an Applied Biosystems
"Automatic Sequencer, model 373 A" apparatus according to the
manufacturer's instructions.

Thus, a clone designated LTRGAG12 could be obtained, and is
represented by its internal sequence identified by
SEQ ID NO: 60.

This clone is probably representative of endogenous elements
close to ERV-9, present in human DNA, in particular in the DNA
of patients suffering from MS, and capable of interfering with
the expression of the MSRV-1 retrovirus, hence capable of having
a role in the pathogenesis associated with the MSRV-1 retrovirus
and capable of serving as marker for a specific expression in
the pathology in question.

EXAMPLE 17: DETECTION OF ANTI-MSRV-1 SPECIFIC
ANTIBODIES IN HUMAN SERUM

Identification of the sequence of the pol gene of the MSRV-1
retrovirus and of an open reading frame of this gene enabled the
amino acid sequence SEQ ID NO: 63 of a region of the said gene,
referenced SEQ ID NO: 62, to be determined.

Different synthetic peptides corresponding to fragments of the
protein sequence of MSRV-1 reverse transcriptase encoded by the
pol gene were tested for their antigenic specificity with
respect to sera of patients suffering from MS and of healthy
controls.

The peptides were synthesized chemically by solid-phase
synthesis according to the Merrifield technique (22). The
practical details are those described below.

a) Peptide synthesis: 

The peptides were synthesized on a
phenylacetamidomethyl (PAM)/polystyrene/divinylbenzene resin
(Applied Biosystems, Inc., Foster City, CA), using an
"Applied Biosystems 430A" automatic synthesizer. The amino
acids are coupled in the form of hydroxybenzotriazole
(HOBT) esters. The amino acids used are obtained from
Novabiochem (Läufelfingen, Switzerland) or Bachem
(Bubendorf, Switzerland).

The chemical synthesis was performed using a
double coupling protocol with N-methylpyrrolidone (NMP) as
solvent. The peptides were cleaved from the resin, and the
side-chain protective groups removed, simultaneously, using
hydrofluoric acid (HF) in a suitable apparatus (type I
cleavage apparatus, Peptide Institute, Osaka, Japan).

For 1 g of peptidyl resin, 10 ml of HF, 1 ml of
anisole and 1 ml of dimethyl sulphide (DMS) are used. The
mixture is stirred for 45 minutes at -2°C. The HF is then
evaporated off under vacuum. After intensive washes with
ether, the peptide is eluted from the resin with 10%
acetic acid and then lyophilized.

The peptides are purified by preparative high
performance liquid chromatography on a VYDAC C18 type
column (250 x 21 mm) (The Separations Group, Hesperia, CA,
USA). Elution is carried out with an acetonitrile gradient
at a flow rate of 22 ml/min. The fractions collected are
monitored by an elution under isocratic conditions on a
VYDAC™ C18 analytical column (250 x 4.6 mm) at a flow rate
of 1 ml/min. Fractions having the same retention time are
pooled and lyophilized. The preponderant fraction is then
analysed by analytical high performance liquid
chromatography with the system described above. A
peptide which is considered to be of acceptable purity
manifests itself in a single peak representing not less
than 95% of the chromatogram.

The purified peptides are then analysed with the
object of monitoring their amino acid composition, using
an Applied Biosystems 420H automatic amino acid analyser.
Measurement of the (average) chemical molecular mass of
the peptides is obtained using LSIMS mass spectrometry in
the positive ion mode on a VG ZAB-ZSEQ double focusing
instrument connected to a DEC-VAX 2000 acquisition system
(VG Analytical Ltd, Manchester, England).

The reactivity of the different peptides was 
tested against sera of patients suffering from MS and 
against sera of healthy controls. This enabled a peptide 
designated S24Q to be selected, whose sequence is 
identified by SEQ ID NO: 63, encoded by a nucleotide 
sequence of the pol gene of MSRV-1 (SEQ ID NO: 62).

b) Antigenic properties:

The antigenic properties of the S24Q peptide
were demonstrated according to the ELISA protocol
described below.

The lyophilized S24Q peptide was dissolved in
10% acetic acid at a concentration of 1 mg/ml. This stock
solution was aliquoted and kept at +4°C for use over a
fortnight, or frozen at -20°C for use within 2 months. An
aliquot is diluted in PBS (phosphate buffered saline)
solution so as to obtain a final peptide concentration of
5 micrograms/ml. 100 microlitres of this dilution are
placed in each well of Nunc Maxisorb (trade name)
microtitration plates. The plates are covered with a
"plate-sealer" type adhesive and kept for 2 hours at +37°C
for the phase of adsorption of the peptide to the plastic.
The adhesive is removed and the plates are washed three
times with a volume of 300 microlitres of a solution A
(1X PBS, 0.05% Tween 20®), then inverted over an
absorbent tissue. The plates thus drained are filled with
250 microlitres per well of a solution B (solution A + 10%
of goat serum), then covered with an adhesive and
incubated for 1 hour at 37°C. The plates are then washed
three times with the solution A as described above.
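
As a worked check of the coating step, the following sketch recomputes the dilution arithmetic implied by the stock and coating concentrations given above. The function names and the 96-well plate format are assumptions made for illustration only; the text states only the stock concentration, the coating concentration and the volume per well.

```python
STOCK_UG_PER_ML = 1000   # 1 mg/ml stock in 10% acetic acid, in micrograms/ml
COATING_UG_PER_ML = 5    # final peptide concentration in PBS
WELL_VOLUME_UL = 100     # coating volume placed in each well


def dilution_factor(stock_ug_per_ml, target_ug_per_ml):
    """Fold dilution needed to go from the stock to the coating solution."""
    return stock_ug_per_ml / target_ug_per_ml


def peptide_per_well_ug(conc_ug_per_ml, volume_ul):
    """Mass of peptide (micrograms) offered to one well."""
    return conc_ug_per_ml * volume_ul / 1000


def stock_volume_ul(n_wells, well_volume_ul, factor):
    """Stock volume (microlitres) needed to coat n_wells, with no overage."""
    return n_wells * well_volume_ul / factor


if __name__ == "__main__":
    factor = dilution_factor(STOCK_UG_PER_ML, COATING_UG_PER_ML)
    print("dilution factor:", factor)
    print("peptide per well (ug):",
          peptide_per_well_ug(COATING_UG_PER_ML, WELL_VOLUME_UL))
    # Assumption: a 96-well plate; the text does not state the plate format.
    print("stock for 96 wells (ul):",
          stock_volume_ul(96, WELL_VOLUME_UL, factor))
```

Under these figures the stock is diluted 200-fold and each well receives 0.5 microgram of peptide.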

The test serum samples are diluted beforehand to
1/100 in the solution B, and 100 microlitres of each
diluted test serum are placed in the wells of each
microtitration plate. A negative control is placed in one well
of each plate, in the form of 100 microlitres of buffer B.
The plates covered with an adhesive are then incubated for
1 hour 30 min at 37°C. The plates are then washed three
times with the solution A as described above. For the IgG
response, a peroxidase-labelled goat antibody directed
against human IgG (marketed by Jackson ImmunoResearch
Inc.) is diluted in the solution B (dilution 1/10,000).
100 microlitres of the appropriate dilution of the
labelled antibody are then placed in each well of the
microtitration plates, and the plates covered with an
adhesive are incubated for 1 hour at 37°C. A further
washing of the plates is then performed as described
above. In parallel, the peroxidase substrate is prepared
according to the directions of the bioMérieux kits. 100
microlitres of substrate solution are placed in each well,
and the plates are placed protected from light for 20 to
30 minutes at room temperature.

When the colour reaction has stabilized, 
50 microlitres of Color 2 (bioMerieux trade name) are 
placed in each well in order to stop the reaction. The 
plates are placed immediately in an ELISA plate 
spectrophotometric reader, and the optical density (OD) of 
each well is read at a wavelength of 492 nm. 

The serological samples are introduced in duplicate
or in triplicate, and the optical density (OD)
corresponding to the serum tested is calculated by taking
the mean of the OD values obtained for the same sample at
the same dilution.

The net OD of each serum corresponds to the mean
OD of the serum minus the mean OD of the negative control
(solution B: PBS, 0.05% Tween 20®, 10% goat serum).
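
The net OD calculation described above can be sketched as follows. This is a minimal illustration only: the function name and the example readings are hypothetical and are not taken from the experiments reported here.

```python
from statistics import mean


def net_od(serum_ods, negative_control_ods):
    """Mean OD of the replicate wells minus the mean OD of the negative control."""
    return mean(serum_ods) - mean(negative_control_ods)


if __name__ == "__main__":
    # Hypothetical readings at 492 nm, for illustration only.
    duplicate_wells = [0.412, 0.398]
    negative_control_wells = [0.051]
    print(round(net_od(duplicate_wells, negative_control_wells), 3))
```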

c) Detection of anti-MSRV-1 IgG antibodies
(S24Q) by ELISA:

The technique described above was used with the
S24Q peptide to test for the presence of anti-MSRV-1
specific IgG antibodies in the serum of 15 patients for
whom a definite diagnosis of MS was established according
to the criteria of Poser (23), and of 15 healthy controls
(blood donors).

Figure 41 shows the results for each serum
tested with an anti-IgG antibody. Each vertical bar
represents the net optical density (OD at 492 nm) of a
serum tested. The ordinate axis gives the net OD at the
top of the vertical bars. The first 15 vertical bars lying
to the left of the vertical broken line represent the sera
of 15 healthy controls (blood donors), and the 15 vertical
bars lying to the right of the vertical broken line
represent the sera of 15 cases of MS tested. The diagram
enables 2 controls to be revealed whose OD rises above the
grouped values of the control population. These values may
represent the presence of specific IgGs in symptomless
seropositive patients. Two methods were hence evaluated in
order to determine the statistical threshold of positivity
of the test.

The mean of the net OD values for the controls,
including the controls with high net OD values, is 0.129
and the standard deviation is 0.06. Without the 2 controls
whose OD values are greater than 0.2, the mean of the
"negative" controls is 0.107 and the standard deviation is
0.03. A theoretical threshold of positivity may be
calculated according to the formula:

threshold value = (mean of the net OD values of the
negative controls) + (2 or 3 x standard deviation
of the net OD values of the negative controls).

In the first case, there are considered to be
symptomless seropositives, and the threshold value is
equal to 0.11 + (3 x 0.03) = 0.20. The negative results
represent a non-specific "background" of the presence of
antibodies directed specifically against an epitope of the
peptide.

In the second case, if the set of controls
consisting of blood donors in apparent good health is
taken as a reference basis, without excluding the sera
which are, on the face of it, seropositive, the standard
deviation of the "non-MS controls" is 0.06. The threshold
value then becomes 0.13 + (3 x 0.06) = 0.31.
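
The two threshold calculations can be retraced with the short sketch below, which uses the net OD values of the 15 control sera reported for this experiment. It is an illustration under stated assumptions, not the inventors' procedure: the function names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

# Net OD values (492 nm) of the 15 control sera, as listed in the IgG table.
CONTROL_NET_OD = [
    0.101, 0.058, 0.126, 0.131, 0.105, 0.294, 0.116, 0.088,
    0.105, 0.172, 0.137, 0.223, 0.080, 0.073, 0.132,
]


def positivity_threshold(values, k=3):
    """Mean plus k sample standard deviations of the reference values."""
    return mean(values) + k * stdev(values)


def without_high_controls(values, cutoff=0.2):
    """First method: drop reference sera whose net OD is greater than the cutoff."""
    return [v for v in values if v <= cutoff]


if __name__ == "__main__":
    first = positivity_threshold(without_high_controls(CONTROL_NET_OD))
    second = positivity_threshold(CONTROL_NET_OD)
    print(f"first method (2 high controls excluded): {first:.2f}")
    print(f"second method (all 15 controls): {second:.2f}")
```

With these inputs the two results round to the thresholds of 0.20 and 0.31 quoted in the text.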

According to this latter analysis, the test is
specific for MS, since, as shown in Table 3, no control
has a net OD above this threshold. In fact, this result
reflects the fact that the antibody titres in patients
suffering from MS are, for the most part, higher than in
healthy controls who have been in contact with MSRV-1.

In accordance with the first method of
calculation, and as shown in Figure 41 and in Table 3, 6 of the
15 MS sera give a positive result (OD greater than or
equal to 0.2), indicating the presence of IgGs
specifically directed against the S24Q peptide, hence
against a portion of the reverse transcriptase enzyme of
the MSRV-1 retrovirus encoded by its pol gene, and
consequently against the MSRV-1 retrovirus.

Thus, approximately 40% of the MS patients
tested have reacted against an epitope carried by the S24Q
peptide and possess circulating IgGs directed against the
latter.

Two out of 15 blood donors in apparent good
health show a positive result. Thus, it is apparent that
approximately 13% of the symptomless population may have
been in contact with an epitope carried by the S24Q
peptide under conditions which have led to an active
immunization which manifests itself in the persistence of
specific serum IgGs. These conditions are compatible with
an immunization against the MSRV-1 retrovirus reverse
transcriptase during an infection with (and/or
reactivation of) the MSRV-1 retrovirus. The absence of apparent
neurological pathology recalling MS in these seropositive
controls may indicate that they are healthy carriers and
have eliminated an infectious virus after immunizing
themselves, or that they constitute an at-risk population
of chronic carriers. In effect, epidemiological data
showing that a pathogenic agent present in the environment
of regions of high prevalence of MS may be the cause of
this disease imply that a fraction of the population free
from MS has necessarily been in contact with such a
pathogenic agent. It has been shown that the MSRV-1
retrovirus constitutes all or part of this "pathogenic
agent" at the source of MS, and it is hence normal for
controls taken from a healthy population to possess IgG
type antibodies against components of the MSRV-1
retrovirus.

Lastly, the detection of anti-S24Q antibodies in
only one out of two MS cases tested here may reflect the
fact that this peptide does not represent an
immunodominant MSRV-1 epitope, that inter-individual
strain variations may induce an immunization against a
divergent peptide motif in the same region, or that the
course of the disease and the treatments followed may
modulate over time the antibody response against the S24Q
peptide.

TABLE NO. 3

                CONTROLS    MS

                0.101       0.136
                0.058       0.391
                0.126       0.37
                0.131       0.119
                0.105       0.267
                0.294       0.141
                0.116       0.102
                0.088       0.18
                0.105       0.411
                0.172       0.164
                0.137       0.049
                0.223       0.644
                0.08        0.268
                0.073       0.065
                0.132       0.074

Mean            0.129
Std. Dev.       0.06
Threshold       0.31


d) Detection of anti-MSRV-1 IgM antibodies by
ELISA:

The ELISA technique with the S24Q peptide was
used to test for the presence of anti-MSRV-1 IgM specific
antibodies in the same sera as above.

Figure 42 shows the results for each serum tested
with an anti-IgM antibody. Each vertical bar represents
the net optical density (OD at 492 nm) of a serum tested.
The ordinate axis gives the net OD at the top of the
vertical bars. The first 15 vertical bars lying to the
left of the vertical line cutting the abscissa axis
represent the sera of 15 healthy controls (blood donors),
and the vertical bars lying to the right of the vertical
broken line represent the sera of 15 cases of MS tested.

The mean of the OD values for the MS cases
tested is 1.6.

The mean of the net OD values for the controls
is 0.7.

The standard deviation of the negative controls
is 0.6.

The threshold of theoretical positivity may be
calculated according to the formula:

threshold value = (mean of the OD values of the negative
controls) + (3 x standard deviation of
the OD values of the negative controls)

The threshold value is hence equal to 0.7 + (3 x 0.6) =
2.5.

The negative results represent a non-specific
"background" of the presence of antibodies directed
specifically against an epitope of the peptide.

According to this analysis, and as shown in
Figure 42 and in the corresponding Table 4, the IgM test
is specific for MS, since no control has a net OD above
the threshold. 6 of the 15 MS sera produce a positive IgM
result.

The difference in seroprevalence between the MS
and control populations is extremely significant:
"chi-squared" test, p < 0.002.

These results point to an aetiopathogenic role
of MSRV-1 in MS.

Thus, the detection of IgM and IgG antibodies
against the S24Q peptide makes it possible to evaluate,
alone or in combination with other MSRV-1 peptides, the
course of an MSRV-1 infection and/or of the viral
reactivation of MSRV-1.

TABLE NO. 4

                  CONTROLS    MS

                  1.449       0.974
                  0.371       6.117
                  0.448       2.883
                  0.456       1.945
                  0.885       1.787
                  2.235       0.273
                  0.301       1.766
                  0.138       0.668
                  0.16        2.603
                  1.073       0.802
                  1.366       0.245
                  0.283       0.147
                  0.262       2.441
                  0.585       0.287
                  0.356       0.589

Mean              0.7
Std. Dev.         0.6
Threshold Value   2.5


The new discoveries made and the new methods developed by
the inventors permit the improved implementation of
diagnostic tests for MSRV-1 infection and/or reactivation
and the evaluation of a therapy in MS and/or RA on the basis of
its efficacy in "negativing" the detection of these agents
in the patient's biological fluids. Furthermore, early
detection in individuals not yet displaying neurological
signs of MS or rheumatological signs of RA could make it
possible to institute a treatment which would be all the
more effective with respect to the subsequent clinical
course for the fact that it would precede the lesion stage
which corresponds to the onset of the clinical disorders.
Now, at the present time, a diagnosis of MS or RA cannot
be established before a symptomatology of lesions has set
in, and hence no treatment is instituted before the
emergence of a clinical picture suggestive of lesions
which are already significant. The diagnosis of an MSRV-1
and/or MSRV-2 infection and/or reactivation in man is
hence of decisive importance, and the present invention
provides the means of doing this.

It is thus possible, apart from carrying out a
diagnosis of MSRV-1 infection and/or reactivation, to
evaluate a therapy in MS on the basis of its efficacy in
"negativing" the detection of these agents in the
patients' biological fluids.

EXAMPLE 18:

1) MATERIALS AND METHODS

- Patients and clinical samples

Choroid plexus cells from MS patients and
controls were obtained from the brain-cell library,
Laboratoire R. Escourolles, Hôpital de la Salpêtrière,
Paris, France. Non-tumoral leptomeningeal cells from
controls were obtained as previously described (26).
Peripheral blood samples from MS and control patients,
used for obtaining B-cell lines and plasma, were obtained
from the Neurological Departments, CHU de Grenoble, and from
INSERM U 134, Hôpital de la Salpêtrière, France. Clinical
details and origin of the 10 MS patients and of the 10
patients with other neurological diseases who provided CSF
samples are given in Table 6.

- Cell cultures, virus isolation and purification

All cell-types were cultured as previously
described (3, 5, 26).

All cultures were regularly screened for mycoplasma
contamination with an ELISA mycoplasma-detection kit
(Boehringer). No cell extract or supernatant used
contained detectable mycoplasma.

Extracellular virion purification and sucrose density
gradients were performed as previously described (3, 5,
26). From each sucrose gradient, 0.5-1 ml fractions were
collected from the top of the tubes, with a 1000 µl
Pipetman and a different sterile tip for each fraction.
60 µl were used for RT activity assay and the rest was
mixed with 1 volume of buffer containing 4 M guanidinium
thiocyanate, 0.5% N-lauroyl sarcosine, 25 mM EDTA, 0.2%
β-mercaptoethanol adjusted to pH 5.5 with acetic acid. These
mixtures were frozen at -80°C for further RNA extraction
or directly processed according to Chomczynski (20), with
an overnight precipitation step at -20°C, in the presence of
RNase-free glycogen (Boehringer). RNA was dissolved in 20 to
50 µl of DEPC-treated water in the presence of 1-2 µl of
recombinant RNase-inhibitor (PROMEGA) and 0.1 mM DTT. 10 µl
aliquots were used for each RT-PCR.

- Reverse transcriptase activity

RT-activity was tested with 20 mM Mg++ and poly-Cm
or poly-C templates, in virion pellets or fractions from
sucrose gradients as previously described (3, 5, 26).

- cDNA synthesis and 'Pan-retro' RT-PCR with degenerate
primers

A total RT-activity between 10^6 and 10^7 dpm was
required in the fraction containing the peak of purified
virions. The "Pan-retro" RT-PCR technique (27) was
performed on virion RNA extracted by the method of
Chomczynski (20) and dissolved in 20 µl RNase-free water.
5 µl RNA solution was incubated for 30 min at 37°C with
0.3 units (3 units for CSF series) of RNase-free DNase-1
(Boehringer) in a 20 µl reaction containing 7.5 mM random
hexamers, 5 mM Hepes-HCl pH 6.9, 75 mM KCl, 3 mM MgCl2, 10
mM DTT, 50 mM Tris-HCl pH 7.5, 0.5 mM each dNTP, and 20
units recombinant RNase inhibitor (Promega). The DNase was
then heat inactivated at 80°C for 10 min. 20 units MoMLV
RT (Pharmacia) and a further 20 units of RNase inhibitor
were added to each tube in a Genesphere™ enclosure
(Safetech, Ireland) and cDNA was synthesised for 90 min at
37°C. Following reverse transcription, the cDNA was boiled
for 5 min then cooled rapidly on ice. The Round 1 PCR mix
(final volume 25 µl per reaction; 20 mM Tris-HCl pH 8.4,
60 mM KCl, 2.5 mM MgCl2, 200 ng each of primers PAN-UO and
PAN-DI [see Figure 44], 0.2 mM each dNTP) was treated with
0.3 units DNase-1 and then heat inactivated as above.
2.5 µl cDNA was added in the Genesphere™ enclosure and the
tubes heated to 80°C before adding 0.5 units Taq
polymerase (Perkin Elmer) individually to each tube ("hot
start"). Round 1 PCR parameters were 35 cycles of 95°C for
1 min, 34°C for 30 sec, 72°C for 1 min, with a final 7 min
extension at 72°C. 0.5 µl of Round 1 PCR product was
transferred to the Round 2 DNase-treated PCR mix
(composition as for Round 1 but containing primers PAN-UI
and PAN-DI) using the "hot start" procedure. Round 2 PCR
parameters were as for Round 1 but using 30 cycles only
and annealing at 45°C for 1 min.
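
The two cycling programs can be written down as data, which makes the differences between the rounds explicit. The sketch below is illustrative only: the class and field names are hypothetical, ramp times are ignored, and it is assumed that Round 2 keeps the denaturation, extension and final 7 min extension steps of Round 1, as the phrase "as for Round 1" suggests.

```python
from dataclasses import dataclass
from typing import List, Tuple

Step = Tuple[int, int]  # (temperature in deg C, hold time in seconds)


@dataclass
class PcrProgram:
    name: str
    cycles: int
    steps: List[Step]
    final_extension: Step

    def hold_seconds(self) -> int:
        """Sum of programmed hold times; ramp times are ignored."""
        per_cycle = sum(seconds for _, seconds in self.steps)
        return self.cycles * per_cycle + self.final_extension[1]


ROUND_1 = PcrProgram("Round 1", 35, [(95, 60), (34, 30), (72, 60)], (72, 7 * 60))
# Round 2: as Round 1, but 30 cycles only and annealing at 45 deg C for 1 min.
ROUND_2 = PcrProgram("Round 2", 30, [(95, 60), (45, 60), (72, 60)], (72, 7 * 60))

if __name__ == "__main__":
    for program in (ROUND_1, ROUND_2):
        print(program.name, program.hold_seconds() / 60, "min of programmed holds")
```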

- Cloning of PCR products

PCR products were cloned using the TA-cloning®
kit (British Biotechnology) according to the
manufacturer's recommendations.

- Sequencing

Sequencing reactions were performed using the
"Prism ready reaction kit dye deoxyterminator cycle
sequencing kit" (Applied Biosystems). Automatic sequence
analysis was performed on an automatic sequencer (Applied
Biosystems, 373 A).

- RT-PCR with ST1 primer sets

The first PCR round was performed directly from the
cDNA reaction mixture according to the one-step RT-PCR
technique described by Mallet et al. (28). This one-step
RT-PCR procedure reduced the probability of airborne
contamination when opening the tubes and transferring PCR
reagents after an independent cDNA synthesis. RNA was
extracted as previously from 2 ml of plasma (frozen in
liquid nitrogen and stored at -80°C) or from a 500 µl
sucrose fraction with a total RT-activity above 10^6 dpm,
and resuspended in 50 µl of RNase-free water. For each
RT-PCR reaction 10 µl of RNA solution was incubated in a
Perkin-Elmer 480 thermocycler, 15 min at 20°C with 1 U of
RNase-free DNase 1 and 1.2 µl of buffer (50 mM
Tris, 10 mM MgCl2 and 0.1 mM DTT) containing 1 U/µl of
RNase-inhibitor (PROMEGA), and heated at 70°C for 10 min for
DNase inactivation. The solution was placed on ice and
mixed (in conditions preventing airborne dust/DNA
contamination) with 88 µl of PCR mix containing: 1X Taq
buffer, 25 nM/tube dNTPs, 40 pM/tube of each first round
primer (ST1.1 upstream primer:
5' AGGAGTAAGGAAACCCAACGGAC 3' (SEQ ID NO: 99); ST1.1
downstream primer: 5' TAAGAGTTGCACAAGTGCG 3' (SEQ ID
NO: 100)), 2.5 U/tube of Taq (Appligene) and 10 U/tube of
AMV-RT (Boehringer). Each tube was further incubated in a
Perkin-Elmer 480 thermocycler for 10 min at 65°C, followed
by 2 h at 42°C for cDNA synthesis and 5 min at 95°C for
inactivation of AMV-RT and DNA denaturation. First round
parameters were 40 cycles of 95°C for 1 min, 53°C for 2.5
min, 72°C for 1 min, with a final extension of 10 min at
72°C. 10 µl of the first round were transferred to the
second round PCR mix previously treated at 20°C for 15 min
with RNase-free DNase 1 (0.02 U/µl) followed by DNase
inactivation at 70°C for 10 min. This mix contained 1X Taq
buffer, 25 nM/tube dNTPs, 40 pM/tube of each second round
primer [ST1.2 upstream primer: 5' TCAGGGATAGCCCCCATCTAT 3'
(SEQ ID NO: 101); ST1.2 downstream primer:
5' AACCCTTTGCCACTACATCAATTT 3' (SEQ ID NO: 102)] and
2.5 U/tube of Taq (Appligene). Second round parameters
were 30 cycles of 95°C for 1 min, 53°C for 1.5 min, 72°C
for 1 min, with a final extension of 8 min at 72°C. 20 µl
of this nested RT-PCR product were deposited on a 0.7%
agarose gel containing ethidium bromide and exposed to UV
light for the visualization of amplified products.

- Hybridisation analysis of PCR products: MSRV-pol
detection by ELOSA

The protocol was essentially as previously
described (21) but with the following modifications: Nunc
Maxisorb microtitre plates were coated with 100 ng per
well capture probe CpV1B (see Figure 44) either by passive
adsorption (21) or alternatively by using streptavidin
coated plates and biotinylated CpV1B. Peroxidase-labelled
detector probe DpV1 (see Figure 44) was used and the assay
cut-off was defined as the mean of 4 negative controls
plus 0.2 OD492 units.
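
The cut-off rule stated above can be sketched as follows. The function names and the example readings are hypothetical, and treating a sample as positive when its OD is strictly greater than the cut-off is an assumption, since the text does not say how a reading equal to the cut-off is scored.

```python
from statistics import mean

CUTOFF_MARGIN_OD = 0.2  # OD492 units added to the mean of the negative controls


def elosa_cutoff(negative_control_ods):
    """Assay cut-off: mean of the 4 negative controls plus 0.2 OD492 units."""
    if len(negative_control_ods) != 4:
        raise ValueError("the protocol specifies 4 negative controls")
    return mean(negative_control_ods) + CUTOFF_MARGIN_OD


def is_positive(sample_od, cutoff):
    """Assumption: a sample is scored positive when its OD exceeds the cut-off."""
    return sample_od > cutoff


if __name__ == "__main__":
    # Hypothetical OD492 readings, for illustration only.
    cutoff = elosa_cutoff([0.04, 0.05, 0.06, 0.05])
    print(round(cutoff, 3), is_positive(0.90, cutoff), is_positive(0.10, cutoff))
```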

- RNA extraction, cDNA synthesis and PCR amplification
from MS plasma samples:

Total RNA was extracted from human MS plasma by
a guanidinium method as described elsewhere (29). Total RNA
extracted from 100 µl of plasma was treated with
RNase-free DNase I (0.1 U/µl; Boehringer Mannheim, France) and
reverse transcribed under the conditions recommended by
the manufacturer, using Superscript reverse transcriptase
(Gibco-BRL, France). The resulting cDNAs were amplified by
semi-nested PCR through 35 cycles (94°C 1 min, 55°C 1 min,
72°C 1 min 30 sec) and 72°C 8 min for a final extension.
Three different fragments in the RT region were amplified
by the following specific primers:

- in the protease (PRT) region, for the 1st and
2nd round of PCR, respectively, sense primer
[5' TCC AGC AGC AGG ACT GAG GGT 3' (SEQ ID NO: 103)] and
antisense primers [5' CTG TCC GTT GGG TTT CCT TAC TCC T 3'
(SEQ ID NO: 104) / 5' GAC AGC AAA TGG GTA TTC CTT TCC 3'
(SEQ ID NO: 105)]

- in the fragment A of the RT region (cf. Fig.
46), for the 1st and 2nd round of PCR, respectively, sense
primer [5' AGG AGT AAG GAA ACC CAA CGG ACA G 3' (SEQ ID
NO: 106)] and antisense primers [5' TGT ATA TAA TGG TCT GGC
TAT TGG G 3' (SEQ ID NO: 107) / 5' TTC GGC AGA AAC CTG TTA
TGC CAA GG 3' (SEQ ID NO: 108)]

- in the fragment B of the RT region (cf. Fig.
46), for the 1st and 2nd round of PCR, respectively, sense
primers [5' GGC TCT GCT CAC AGG AGA TTA GAT AC 3' (SEQ ID
NO: 109) / 5' AAA GGC ACC AGG GCC CTC AGT GAG GA 3' (SEQ ID
NO: 110)] and antisense primer [5' GGT TTA AGA GTT GCA
CAA GTG CGC AGT C 3' (SEQ ID NO: 101)].

The amplified fragments were analysed on
ethidium bromide-stained agarose gels, cloned in TA
cloning vector (Invitrogen) and sequenced.

2) RESULTS

- Specific retroviral RNA is found in extracellular
virions from MS patient-derived cell cultures and in MS
patients' CSF.

Choroid plexus cells (4) (obtained post-mortem)
and EBV-immortalized peripheral blood B-lymphocytes (30,
31) from MS patients gave rise to cultures expressing
100-120 nm viral particles associated with RT-activity similar
to that of the original LM7 isolate (3). Similar
cell-types from non-MS donors produced neither this RT-activity
nor virions. All the 'infected' cultures were poorly
and/or transiently productive and/or had a limited
lifespan. Therefore, in order to analyse the genomic RNA
present in the very limited quantity of extracellular
virions, we used an RT-PCR approach to amplify, with
degenerate primers, a conserved region of the pol gene
present in all known retroviruses (12); the techniques
based on this approach will be called "Pan-retro" RT-PCR.
Extensive DNase treatment of samples and reagents was
essential, because human DNA contains many endogenous
retroviral elements amplifiable by this technique.

"Pan-retro" RT-PCR experiments were performed on
sucrose-density gradient purified virions from supernatants of
different types of cell cultures and their non-infected
controls: (i) choroid plexus cells sampled post-mortem
from MS brain (PLI-1), (ii) choroid plexus cells from
non-MS brain autopsy, infected by co-culture with irradiated
LM7 cells (LM7P), and (iii) identical non-infected
choroid-plexus cells. "Early" B-cell lines obtained by
spontaneous in vitro transformation of two
EBV-seropositive individuals, (iv) one MS patient and (v) one
non-MS control, were also analysed. Figure 43 illustrates
the RT-activity in sucrose-gradient fractions obtained
from the B-cell cultures. The technique described by Shih
et al. (12) was modified in a semi-nested RT-PCR protocol
(27) using degenerate primers (Fig. 2) and extensive DNase
treatment. PCR amplifications were performed in London
(Dpt of Virology, U.C.L.M.S.) on coded aliquots of the
density gradient fractions. Blind and systematic cloning
and sequencing of the PCR products were undertaken in an
independent laboratory (bioMérieux, Lyon). After complete
sequencing of 20 to 30 clones per sucrose gradient
fraction, the codes were broken and results analysed in
parallel with the RT-activity data.

Table 5 presents the distribution of sequences obtained
from sucrose gradient fractions containing the peak of
viral RT-activity in MS-derived cultures and also the
sequences amplified from the corresponding RT-activity
negative fractions of uninfected cultures. The predominant
sequence detected in bands of the expected size
(approximately 140 bp) amplified in all the RT-activity
positive fractions (but not in the RT-activity negative
fractions) was different from known retroviruses and was
designated MSRV-cpol. MSRV-cpol sequences exhibited partial
homology (70-75%) with ERV9, a previously described
endogenous retroviral sequence (18). A few ERV9 sequences
(>90% homology with ERV9) were also present but clearly
represented a minority of clones. In addition to typical
pol sequences, numerous PCR artefacts (primer multimers,
concatemers or single-primer amplifications) related to the
use of degenerate primers and low-temperature annealing,
were found in all samples (Table 5).

Figure 44 shows an alignment of a consensus sequence of
MSRV-cpol with the corresponding VLPQG / YMDD region of
diverse retroviruses. Figure 45 displays a phylogenetic tree
based on the evolutionary conserved amino acid sequences
of both exogenous and endogenous retroviruses in this
region. From this tree it can be seen that the pol gene of
MSRV is phylogenetically related to the C-type group of
oncovirinae.

A small scale study was performed to determine the
prevalence of MSRV-cpol sequences in the CSF of patients
with MS. Identification of MSRV-cpol in PCR products by
cloning and sequencing is both laborious and time
consuming. We therefore devised an enzyme-linked
oligosorbent assay (ELOSA), using a capture probe (CpV1B)
and a peroxidase-labelled detector probe (DpV1), for the
rapid identification of MSRV-cpol sequences in
'Pan-retrovirus' PCR products (Figure 44). The specificity of
this sandwich hybridisation-based assay for MSRV-cpol was
tested with both distantly related (HIV and MoMLV) and
closely related (ERV9) pol sequences. No significant cross
reactivity with such targets was observed despite the
ability of the ELOSA to detect as little as 0.01 ng of
MSRV-cpol DNA.

Cerebrospinal fluid (CSF) samples were available from 10
patients with MS and from 10 patients with other
neurological disorders. Total RNA was extracted from CSF
pellets, reverse transcribed and amplified as above. ELOSA
analysis (Table 6) of the PCR products revealed MSRV-cpol
sequences in 5 of the 10 MS patient samples but in none of
the 10 samples from patients with other neurological
diseases (P<0.05). The presence of MSRV-cpol did not
appear to be correlated with age, sex or type of MS, but
was seen in untreated patients only (5/6). No patient with
immunosuppressive therapy was found positive (0/4). No
correlation between MSRV-cpol detection and CSF cell count
was observed.
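
The text does not name the statistical test behind the P<0.05 figure. As an assumption, the sketch below applies a two-sided Fisher exact test to the 2 x 2 table formed by the counts given above (5 positive and 5 negative MS samples, 0 positive and 10 negative samples from other neurological diseases); the function name is hypothetical and only the Python standard library is used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x positives in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


if __name__ == "__main__":
    # Rows: MS, other neurological diseases. Columns: positive, negative.
    p_value = fisher_exact_two_sided(5, 5, 0, 10)
    print(f"two-sided p = {p_value:.4f}")
```

Under this assumed test the p-value is below 0.05, in line with the significance level quoted.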

- Cloning and sequencing a larger region of the pol gene

An independent identification of the MSRV
genomic sequence was obtained by a non-PCR approach using
RNA extracted from concentrated virions derived from 2.5
liters of LM7-infected sub-cultures of choroid plexus
cells. A limited number of clones was obtained by direct
cloning of the cDNA, one of which (PSJ17) showed partial
homology with ERV9 pol. Specific primers based on the
MSRV-cpol region and on the PSJ17 clone amplified a 740
bp fragment linking the two independent sequences in RNA
extracted from purified virions. PSJ17 was localised on
the 3' side of MSRV-cpol. Further sequence extension on
the 5' side of MSRV-cpol and on the 3' side of PSJ17 was
obtained using RT-PCR approaches on RNA from purified
LM7-like virions produced in MS choroid plexus cultures (4).

In Figure 46, the nucleotide sequence
corresponding to overlapping clones obtained by sequence
extension in the pol gene is represented with the
amino acid translation corresponding to the putative open
reading frames (ORFs) of the protease and of the
reverse-transcriptase. The active site motifs of the protease
(PRT) and of the reverse-transcriptase (RT) are
underlined. In the C-terminal region of the RT sequence,
the dispersed amino acid residues regularly present in
retroviral RNase H domains are also underlined.

- Non-degenerate primers detect MSRV-specific RNA in
virions associated with the peak of RT-activity and in
MS patients' plasma

PCR primers (ST1.1 primer set; positions 603-625/1732-1714,
on Fig. 46) based on overlapping clones in the pol
gene amplified a 1.15 kb segment of the RT region from
several different isolates obtained from different MS
patients. Nested primers (ST1.2; positions 869-889/1513-1490,
on Fig. 46) generated a 700 bp fragment (Figure 47)
which was more easily visualised by ethidium bromide
staining than the first round product generated by ST1.1.
The specificity of PCR products was confirmed by stringent
hybridisation with a peroxidase-labeled MSRV-cpol probe
(Fig. 44), using the ELOSA technique (21).

The ST1.1 and ST1.2 primer sets were used to detect
extracellular MSRV RNA in human plasma, although they are
not optimal for this application. Figure 47 illustrates the
results of PCR amplification of cDNA derived from 2 patient
and 2 control plasma samples tested in parallel with cDNA
from the sucrose density gradient fractions of an MS
choroid plexus isolate. Taq-sequencing of the 700 bp bands
confirmed the presence of MSRV sequence. A very faint
700 bp band is also visible in fraction 10, which
corresponds to the bottom of the tube where aggregated
particles usually sediment. Control RT-PCR for cellular
aldolase transcripts on plasma-derived RNA was negative,
indicating that the results were not due to cellular RNA
released by cell lysis during plasma separation. It should
be noted that this PCR technique was not designed for
epidemiological studies, since its sensitivity is impaired
by the length of the cDNA required (1.15 kb).
Non-degenerate primers amplifying three fragments of the
pol gene (the whole protease region, regions A and B of the
reverse transcriptase; cf. Fig. 46) were also used to
confirm the presence of MSRV sequences in DNase-treated RNA
from MS plasma. These fragments were amplified from the
plasma of a further 4 MS patients with active disease.
Sequence analysis confirmed that the PRT and RT regions
were homologous (>95% and >90% respectively) to MSRV
sequences previously obtained from culture virions. No such
sequences were detected in plasma from healthy controls
(n=4), tested in parallel with MS plasma.

3) DISCUSSION 

- Phylogeny of MSRV 

From the results of this study, it can be concluded that
the virus previously referred to as "LM7" (3, 5, 26)
possesses an RNA genome containing the MSRV pol sequences
described here.

The conserved RT motif of both MSRV and ERV9 is two amino
acids shorter than that of other retroviruses, apart from
human foamy viruses, which nonetheless have a functional
RT. The potential ORF encompassing the entire PRT-RT region
is consistent with the virion-associated RT-activity
detected in sucrose density gradients with infected culture
supernatants. Moreover, since we have recently succeeded in
expressing a recombinant protein from the sequence of MSRV
protease cloned from MS plasma, we can confirm the reality
of the potential PRT ORF. Similar cloning and expression of
other sequences containing potential ORFs for MSRV proteins
is being undertaken to confirm their ability to encode
enzymes and structural proteins of MSRV virions.

The phylogenetic tree in Figure 45, based on the most
conserved amino acid sequence in retroviruses
(VLPQG...YXDD), shows that the MSRV pol gene is related to
the C-type oncoviruses. Apart from ERV9, the closest known
retroviral element is RTVL-H, a human endogenous sequence
known to have a subtype with a functional pol gene (32).
In the pol region, this phylogenetic affiliation to C-type
oncoviruses apparently contradicts our previous assumptions
based on the general morphology of the particles observed
by electron microscopy (EM), which were compatible with a
B or D-type oncovirus (3, 5, 26). However, preliminary data
on env sequences detected in MSRV virions would suggest a
greater phylogenetic proximity to D-type. Such differences
between the phylogenies of the pol and env genes have been
described in MPMV and suggest a recombinatorial origin in
D-type retroviruses (33). D to C type morphological
conversion is also possible, since it has been reported
that a single amino acid substitution in the gag protein
can convert retrovirus morphology to that of a different
type (34).

- Is MSRV an exogenous retrovirus sharing extensive
homology with a related endogenous retrovirus family or an
endogenous retrovirus producing extracellular virions?

Southern blot analysis with an MSRV pol probe under
stringent conditions showed hybridisation with a multicopy
endogenous family (data not presented), indicating the
existence of endogenous elements more closely related to
MSRV than ERV9 itself. Consequently, we were unable to look
for a virion-specific provirus in MSRV-producing cells. In
agreement with the Southern blot findings, PCR studies on
genomic DNA showed multiple band amplification of
MSRV-related endogenous sequences. Since pol is the most
conserved retroviral gene, the sequence described here is
the least suitable region to discriminate between exogenous
and endogenous sequences. It is hoped that sequence
information from other parts of the genome may permit such
a discrimination, even if only over a small portion, as has
recently been demonstrated for the Jaagsiekte retrovirus
(JSRV) of sheep (35). With such sequence data, it would
then become possible to identify the MSRV-specific provirus
in the genome of virion-producing cell cultures.

MSRV could represent a virion-producing exogenous member
of an ERV9-like endogenous family, just as exogenous
strains exist in the well-studied mouse mammary tumour
virus (MMTV) and murine leukaemia virus (MuLV) retroviral
families of mice, and also in the JSRV retroviral family of
sheep (36). Alternatively, it is also conceivable that the
extracellular MSRV virions may be produced by a
replication-competent endogenous provirus. Whether MSRV is
exogenous or endogenous, conceptual similarities exist with
the category of retroviruses represented by MuLV, MMTV and
JSRV. Unlike defective endogenous elements, this category
of agents is known to produce infectious and pathogenic
virions, to cause neurological disease (37), solid tumours
and leukaemias (36, 38) and to express "endogenous
superantigens" (39, 40). Furthermore, in MuLV infections,
the genetic endogenous retroviral background of the mouse
strain can determine susceptibility or resistance to
disease (39, 41). Indeed, such interactions between an
infectious retrovirus and its endogenous counterpart may be
relevant in the pathogenesis of MS, since endogenous
retroviral genotypes are not identical in all individuals.
A genetic control due to related endogenous retroviral
genotypes could therefore contribute to the known
hereditary susceptibility to MS (43), if MSRV does indeed
play an active role in this disease.

In addition, the data in Table 5 suggest that ERV9 elements
may be co-expressed, possibly via trans-activation in
infected cells, and give rise to heterologous RNA packaging
in MSRV virions. Such heterologous packaging is known to
occur in other retroviral systems (42).

- A role for the numerous common viruses previously
implicated in MS?

Among the numerous reports of viruses putatively involved
in the aetiopathogenesis of MS, a significant proportion
focus on two viral families, the paramyxoviridae and the
herpesviridae. Regarding the paramyxoviridae, the key
observation is a frequently increased antibody titer to
measles virus in MS patients, directed essentially, in CSF,
against the measles fusion protein (44). The existence of
amino acid similarities between conserved domains of the
fusion proteins of paramyxoviridae and the transmembrane
protein of retroviruses (45) may explain this observation
if antigenic cross-reactivity between these two proteins
occurred.

With regard to the herpesvirus family, the involvement of
Epstein-Barr Virus (EBV), Herpes Simplex Virus type 1
(HSV-1) and, most recently, Human Herpes Virus 6 (HHV-6)
has been proposed (31, 46, 47). From our previous studies
and from those of other groups, it appears that
herpesviruses may play an important role in MSRV
expression: we have shown that HSV-1 immediate-early ICP0
and ICP4 proteins can transactivate MSRV/LM7 in vitro (6)
and Haahr et al. have proposed an important epidemiological
role for EBV, as a co-factor in MS, triggering retrovirus
reactivation (31). The recent description by Challoner et
al. (47) showing significant expression of HHV-6 proteins
in MS plaques may also suggest a similar role for HHV-6 in
the brain.

EXAMPLE 19: MSRV GENOME DETECTION TECHNIQUE

Following 0.45 micron filtration to remove cellular debris
and RNase digestion to remove residual non-encapsidated
RNA, serum was processed to extract viral RNA by means of
adsorption to a silica matrix. Viral RNA was subjected to
DNase digestion, then a combined reverse transcription-PCR
(RT-PCR) reaction was performed using primers PTpol-A
(sense: 5'xxxx3', SEQ ID NO: 183) and PTpol-F (antisense:
5'xxxx3', SEQ ID NO: 184). A second round of amplification
with nested primers PTpol-B (sense: 5'xxxx3', SEQ ID
NO: 185) and PTpol-E (antisense: 5'xxxx3', SEQ ID NO: 186)
generated a 435 bp PCR product which was identified by gel
electrophoresis. The specificity of each product was
confirmed by dideoxy sequencing. Control reactions without
reverse transcriptase were performed to ensure that the
products were derived from viral RNA. In addition, to
exclude the possibility that the extracted viral RNA might
be contaminated with host cell derived nucleic acids,
aliquots were tested by nested PCR for the presence of
pyruvate dehydrogenase (PDH) DNA and RNA. Samples which
generated a signal in either the PDH or the "no-RT" PCR
assays were excluded from the analysis.

Sera from patients with clinically active MS and controls
were amplified by RT-PCR and sequenced. Virion-associated
MSRV-RNA was detected in the serum of 10 of 19 (53%)
patients with MS but in only 3 of 44 controls without MS
(P=0.0001). The control group consisted of 8 patients (all
MSRV-RNA negative) with rheumatological disorders and 36
healthy adults. MSRV-RNA titres in both MS patients and
controls were apparently low, because even moderate
dilution of sera (<10 fold) caused loss of signal.

In MS patients, detection of MSRV-RNA was not associated
with age, sex, disease duration, or MS type; however, a
significant negative correlation with treatment was
observed. 26 serum samples were obtained from the 19
patients; 100% of the sera from untreated patients
contained detectable MSRV-RNA, whereas it was detectable in
only 4 of 19 samples (21%) obtained during treatment with
corticosteroids and/or azathioprine (P=0.001).

The reason for the apparent loss of virion-associated
MSRV-RNA during immunosuppressive treatment is unknown, but
the finding is in agreement with the previous observations
on the detection of MSRV in cerebrospinal fluid.



TABLE 7

DETECTION OF VIRION ASSOCIATED MSRV-RNA IN MS UNTREATED
PATIENTS & CONTROLS

                              Positive  Negative  Total  % Positive
Controls without MS (a)       3 (b)     41        44     7%
MS sera untreated at
time of sampling              7         0         7      100%

a The control group consisted of 8 patients with
miscellaneous non-MS disorders and 36 healthy adults.
b The detection of MSRV RNA in plasma of a few controls, in
conditions which select virion-packaged RNA, is consistent
with the knowledge that a virus associated with MS should
be present in a minor proportion of the apparently healthy
population. Indeed, such individuals can either be healthy
carriers or be in the pre-clinical (or sub-clinical) phase
of the disease, which can last for years.

METHOD:

- Modified SNAP RNA extraction with filtration and RNase
digestion

(All centrifugations are at room temperature)

Up to 500 microlitres of serum is filtered using 0.45
micron spin filters (Nanosep MF from Flowgen, Catalogue
No. U3-0126, Ref. ODM45). The serum is spun for 5 min at
130,000 g (or for a further 10 min if necessary).

150 microlitres of filtered serum is incubated with 10
units RNase One (Promega Catalogue No. M4261) for 30 min at
37°C.

The 150 microlitres was then extracted using the SNAP RNA
extraction kit (Invitrogen) as below:

- 10 micrograms of poly A RNA was added to the 450
microlitres of Binding Buffer to act as a carrier; this was
then added to the serum and mixed by inversion 6 times; 300
microlitres of propan-2-ol was then added and mixed by
inversion 10 times; 500 microlitres was transferred to the
SNAP column and spun at 1300 g for 1 min and the
flow-through discarded; the remainder was then added to the
SNAP column and spun at 1300 g for 1 min and the
flow-through discarded; the column was then washed with 600
microlitres of Super wash and the flow-through discarded;
the column was then washed with 600 microlitres of 1x RNA
wash and the flow-through discarded; this wash was repeated
with a 2 min 1300 g spin and the flow-through discarded;
the bound nucleic acid was then eluted by incubating with
135 microlitres of RNase free water for 5 min and spun at
1300 g for 1 min.

- 15 microlitres of 10x DNase buffer and 3 microlitres (30
units) of DNase I, RNase free (Boehringer Mannheim Cat.
No. 776 785) was added and incubated for 30 min at 37°C;
450 microlitres of Binding Buffer was added and mixed by
inversion 6 times; 300 microlitres of propan-2-ol was then
added and mixed by inversion 10 times; 500 microlitres was
transferred to the SNAP column and spun at 1300 g for 1 min
and the flow-through discarded; the remainder was then
added to the SNAP column and spun at 1300 g for 1 min and
the flow-through discarded; the column was then washed with
600 microlitres of 1x RNA wash and the flow-through
discarded; this wash was repeated with a 2 min 1300 g spin
and the flow-through discarded; the bound nucleic acid was
then eluted by incubating with 105 microlitres of RNase
free water for 5 min and spun at 1300 g for 1 min.
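
The two-step column loading in both passes follows from the volumes given above. The sketch below is only a bookkeeping aid: the helper name `binding_loads` is hypothetical, and the volume of the poly A carrier is not stated in the text, so it is ignored here.

```python
COLUMN_LOAD_UL = 500  # volume applied to the SNAP column in the first spin (from the text)


def binding_loads(sample_ul, binding_buffer_ul=450, propanol_ul=300):
    """Return (total mix, first column load, remainder) in microlitres."""
    total = sample_ul + binding_buffer_ul + propanol_ul
    first = min(COLUMN_LOAD_UL, total)
    return total, first, total - first


# First pass: 150 microlitres of RNase-treated serum.
first_pass = binding_loads(150)
# Second pass: 135 microlitres eluate + 15 microlitres 10x DNase buffer
# + 3 microlitres DNase I.
second_pass = binding_loads(135 + 15 + 3)
print(first_pass, second_pass)
```

In both passes the mix exceeds what is loaded in the first spin, which is why the text applies the remainder in a second spin.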



- Titan RT-PCR

RT-PCR was performed using the Titan one tube RT-PCR system
(Boehringer Mannheim Cat. No. 1 855 476). 25 microlitres of
RNA was used in the combined RT-PCR reaction. The total
reaction volume was 50 microlitres. Promega rRNasin (10
units) was the RNase inhibitor used. 170 ng of each of
primers SEQ ID NO: 183 and SEQ ID NO: 184 were used. A
single master mix was prepared and the sample RNA added
last. This was performed at room temperature, not on ice.

The RT step consisted of two sequential 30 min incubations
at 50°C and then 60°C. This was immediately followed by the
PCR, which had the following steps:

* Initial denaturation of template at 94°C for 2 min,

* 40 cycles of 94°C for 30 seconds; 60°C for 30 seconds;
68°C for 45 seconds,

* 1 cycle of 68°C for 7 min.
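
The first-round program above can be written down as data, which makes the total incubation time easy to check. This is a minimal sketch: the structure and the name `RT_PCR_PROGRAM` are illustrative, and ramp times between temperatures are not included because the text does not give them.

```python
# Each entry: (label, [(temperature in degrees C, hold time in seconds), ...], repeats).
RT_PCR_PROGRAM = [
    ("reverse transcription", [(50, 30 * 60), (60, 30 * 60)], 1),
    ("initial denaturation", [(94, 2 * 60)], 1),
    ("amplification", [(94, 30), (60, 30), (68, 45)], 40),
    ("final extension", [(68, 7 * 60)], 1),
]


def total_hold_minutes(program):
    """Sum of all programmed hold times, in minutes (ramping excluded)."""
    seconds = sum(repeats * sum(hold for _, hold in steps)
                  for _, steps, repeats in program)
    return seconds / 60


print(total_hold_minutes(RT_PCR_PROGRAM))
```

The programmed holds add up to 139 minutes, of which 70 minutes are the 40 amplification cycles.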

The second round PCR was performed using the Expand long
template PCR system (Boehringer Mannheim Cat. No.
1 681 842). 0.5 microlitres of the RT-PCR mix was added to
25 microlitres of the round 2 PCR mix. Buffer No. 3 and 50
ng of primers B and E were used. The PCR had the following
steps:

* 5 cycles of 94°C for 30 seconds, 60°C for 30 seconds,
68°C for 45 seconds,

* 1 cycle of 68°C for 7 min.

The PCR products were then run on a 2% agarose gel.

The no-RT controls were performed using the "Expand" PCR
system for both rounds. The first round was 40 cycles and
the second round 20 cycles.

As a positive control, a DNA dilution series was used in
both the RT-PCR and the "no-RT" PCR. For a result to be
valid, the RT-PCR and "no-RT" PCRs had to have detected DNA
equivalent to between 1 and 0.1 cells.

The analysis of PCR products of an approximately 435 bp
fragment in the pol region is shown in Table 8.

TABLE 8

ANALYSIS OF PCR PRODUCTS WITH ORF *

Exp    Disease  Clone  ORF  Fragment (bp)  AA-RT Motif Site
46-7   MS       1      +    429            YGDD
                5      +    429            YGDD
                8      +    429            YGDD
68-1   MS       41     +    438            YMDD
                42     +    438            YMDD
                43     +    438            YMDD

* Defective RNA can also be present in circulating
virions, since the fidelity of the MSRV reverse
transcriptase appears to be low and since recombination
events with related endogenous elements can occur. The
intra- and inter-patient variability can therefore be
greater than that illustrated in this example, because of
these encapsidated defective MSRV RNA copies.

Table 9, the data of which were determined from the
alignments of Figures 49 to 53, shows variability:

- between the clones obtained from the same patient plasma
sample in the same PCR amplification experiment; this
means that the patient possesses a virion population which
comprises different MSRV variants at a given time,

- between the sequenced variant populations from different
patients; this means that the variants differ from one
patient to another.



TABLE 9

Degree of identity (percentage) between nucleotide
sequences and between peptide sequences, by direct
comparison of said sequences (see Figures 49-53)

Nucleotide sequences:
  between SEQ ID NO: 169 and MSRV-pol (SEQ ID NO: 1)    92.3 % a    90.4 % b
  SEQ ID NOs: 170, 171, 172 between them                98.7 % a    98.6 % b
  between SEQ ID NO: 176 and MSRV-pol (SEQ ID NO: 1)    82.5 % a    84 % b
  SEQ ID NOs: 177, 178, 179 between them                94.5 % a    95.1 % b

Peptide sequences:
  between SEQ ID NOs: 173, 174, 175 and SEQ ID NO:      81 %
  SEQ ID NOs: 173, 174, 175 between them                97 %
  between SEQ ID NOs: 180, 181, 182 and SEQ ID NO:      73.5 %
  SEQ ID NOs: 180, 181, 182 between them                89 %

a) this percentage is determined on the basis of sequences
excluding the primers

b) this percentage is determined on the basis of sequences
including the primers.

From Figures 53A and 53B, the variability between the
tested patients' sequences can be determined:

- between SEQ ID NO: 169 and SEQ ID NO: 176: 16.5 % a and
14.8 % b

- between the peptide sequences obtained from
SEQ ID NO: 169 and SEQ ID NO: 176: 20 %.
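
Identity and variability percentages of this kind can be computed by direct comparison of two aligned sequences. The sketch below is an assumption about the procedure, not the authors' stated method: the function `percent_identity`, its handling of gap characters and the two short example strings are all hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned columns at which two sequences carry the same residue.

    Both strings must already be aligned to the same length; '-' marks a gap.
    Columns that are gaps in both sequences are ignored, and a gap facing a
    residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if not (a == "-" and b == "-")]
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Two made-up aligned fragments differing at one position out of ten.
clone_a = "ACGTACGTAC"
clone_b = "ACGTACGTTC"
identity = percent_identity(clone_a, clone_b)
variability = 100.0 - identity
print(identity, variability)
```

Variability as used in the text is then simply 100 minus the identity, so whether primers are included (b) or excluded (a) changes both figures together.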

Four microorganisms are mentioned in the specification
(page 3, lines 15-26) and they are identified below. They
have all been deposited with the ECACC*, in accordance with
the provisions of the Budapest Treaty.

- LM7PC deposited on 22nd July 1992 under No. 92072201,

- PLI-2 deposited on 8th January 1993 under No. 93010817,

- POL-2 deposited on 22nd July 1992 under No. V92072202,
and

- MS7PG deposited on 8th January 1993 under No. V93010816.

* ECACC: European Collection of Animal Cell Cultures
Vaccine Research and Production Laboratory
Public Health Laboratory Service
Centre of Applied Microbiology and Research
Porton Down
Salisbury, Wiltshire SP4 0JG
United Kingdom

(1) GENERAL INFORMATION:

(i) APPLICANT: BIO MERIEUX

(ii) TITLE OF THE INVENTION: VIRAL MATERIAL AND NUCLEOTIDE
FRAGMENTS ASSOCIATED WITH MULTIPLE SCLEROSIS, FOR DIAGNOSTIC,
PROPHYLACTIC AND THERAPEUTIC PURPOSES

(iii) NUMBER OF SEQUENCES: 160

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: CABINET GERMAIN & MAUREAU
(B) STREET: 12 rue Boileau
(C) CITY: LYON
(D) COUNTRY: FRANCE
(E) ZIP: 69006

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii)

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 4 72 69 84 30
(B) TELEFAX: 4 72 69 84 31

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1158 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCCTTTGCCA CTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60
CAAGAACTCA GGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120
TATACAGTGC TTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180
GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 300
CCCCATCTAT TTGGCCAGGC ATTAGCCCAA GACTTGAGTC AATTCTCATA CCTGGACACT 360
CTTGTCCTTC AGTACATGGA TGATTTACTT TTAGTCGCCC GTTCAGAAAC CTTGTGCCAT 420
CAAGCCACCC AAGAACTCTT AACTTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 480
AAGGCTCGGC TCTGCTCACA GGAGATTAGA TACTNAGGGC TAAAATTATC CAAAGGCACC 540
AGGGCCCTCA GTGAGGAACG TATCCAGCCT ATACTGGCTT ATCCTCATCC CAAAACCCTA 600
AAGCAACTAA GAGGGTTCCT TGGCATAACA GGTTTCTGCC GAAAACAGAT TCCCAGGTAC 660
ASCCCAATAG CCAGACCATT ATATACACTA ATTANGGAAA CTCAGAAAGC CAATACCTAT 720
TTAGTAAGAT GGACACCTAC AGAAGTGGCT TTCCAGGCCC TAAAGAAGGC CCTAACCCAA 780
GCCCCAGTGT TCAGCTTGCC AACAGGGCAA GATTTTTCTT TATATGCCAC AGAAAAAACA 840
GGAATAGCTC TAGGAGTCCT TACGCAGGTC TCAGGGATGA GCTTGCAACC CGTGGTATAC 900
CTGAGTAAGG AAATTGATGT AGTGGCAAAG GGTTGGCCTC ATNGTTTATG GGTAATGGNG 960
GCAGTAGCAG TCTNAGTATC TGAAGCAGTT AAAATAATAC AGGGAAGAGA TCTTNCTGTG 1020
TGGACATCTC ATGATGTGAA CGGCATACTC ACTGCTAAAG GAGACTTGTG GTTGTCAGAC 1080
AACCATTTAC TTAANTATCA GGCTCTATTA CTTGAAGAGC CAGTGCTGNG ACTGCGCACT 1140
TGTGCAACTC TTAAACCC 1158




(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 297 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CCCTTTGCCA CTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60
CAAGAACTCA GGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120
TATACAGTGC TTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180
GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAAGGGA 297

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GTTTAGGGAT ANCCCTCATC TCTTTGGTCA GGTACTGGCC CAAGATCTAG GCCACTTCTC 60
AGGTCCAGSN ACTCTGTYCC TTCAG 85

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 86 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTTCAGGGAT AGCCCCCATC TATTTGGCCA GGCACTAGCT CAATACTTGA GCCAGTTCTC
ATACCTGGAC AYTCTYGTCC TTCGGT

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTTCARRGAT AGCCCCCATC TATTTGGCCW RGYATTAGCC CAAGACTTGA GYCAATTCTC
ATACCTGGAC ACTCTTGTCC TTYRG

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTTCAGGGAT AGCTCCCATC TATTTGGCCT GGCATTAACC CGAGACTTAA GCCAGTTCTY
ATACGTGGAC ACTCTTGTCC TTTGG

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GTGTTGCCAC AGGGGTTTAR RGATANCYCY CATCTMTTTG GYCWRGYAYT RRCYCRAKAY
YTRRGYCAVT TCTYAKRYSY RGSNAYTCTB KYCCTTYRGT ACATGGATGA C

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 645 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TCAGGGATAG CCCCCATCTA TTTGGCCAGG CATTAGCCCA AGACTTGAGT CAATTCTCAT 60
ACCTGGACAC TCTTGTCCTT CAGTACATGG ATGATTTACT TTTAGTCGCC CGTTCAGAAA 120
CCTTGTGCCA TCAAGCCACC CAAGAACTCT TAACTTTCCT CACTACCTGT GGCTACAAGG 180
TTTCCAAACC AAAGGCTCGG CTCTGCTCAC AGGAGATTAG ATACTNAGGG CTAAAATTAT 240
CCAAAGGCAC CAGGGCCCTC AGTGAGGAAC GTATCCAGCC TATACTGGCT TATCCTCATC 300
CCAAAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATAAC AGGTTTCTGC CGAAAACAGA 360
TTCCCAGGTA CASCCCAATA GCCAGACCAT TATATACACT AATTANGGAA ACTCAGAAAG 420
CCAATACCTA TTTAGTAAGA TGGACACCTA CAGAAGTGGC TTTCCAGGCC CTAAAGAAGG 480
CCCTAACCCA AGCCCCAGTG TTCAGCTTGC CAACAGGGCA AGATTTTTCT TTATATGCCA 540
CAGAAAAAAC AGGAATAGCT CTAGGAGTCC TTACGCAGGT CTCAGGGATG AGCTTGCAAC 600
CCGTGGTATA CCTGAGTAAG GAAATTGATG TAGTGGCAAA GGGTT 645



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 741 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CAAGCCACCC AAGAACTCTT AAATTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 60
AAGGCTCAGC TCTGCTCACA GGAGATTAGA TACTTAGGGT TAAAATTATC CAAAGGCACC 120
AGGGGCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT ATCCTCATCC CAAAACCCTA 180
AAGCAACTAA GAGGGTTCCT TAGCATGATC AGGTTTCTGC CGAAAACAAG ATTCCCAGGT 240
ACAACCAAAA TAGCCAGACC ATTATATACA CTAATTAAGG AAACTCAGAA AGCCAATACC 300
TATTTAGTAA GATGGACACC TAAACAGAAG GCTTTCCAGG CCCTAAAGAA GGCCCTAACC 360
CAAGCCCCAG TGTTCAGCTT GCCAACAGGG CAAGATTTTT CTTTATATGG CACAGAAAAA 420
ACAGGAATCG CTCTAGGAGT CCTTACACAG GTCCGAGGGA TGAGCTTGCA ACCCGTGGCA 480
TACCTGAATA AGGAAATTGA TGTAGTGGCA AAGGGTTGGC CTCATNGTTT ATGGGTAATG 540
GNGGCAGTAG CAGTCTNAGT ATCTGAAGCA GTTAAAATAA TACAGGGAAG AGATCTTNCT 600
GTGTGGACAT CTCATGATGT GAACGGCATA CTCACTGCTA AAGGAGACTT GTGGTTGTCA 660
GACAACCATT TACTTAANTA TCAGGCTCTA TTACTTGAAG AGCCAGTGCT GNGACTGCGC 720
ACTTGTGCAA CTCTTAAACC C 741

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TGGAAAGTGT TGCCACAGGG CGCTGAAGCC TATCGCGTGC AGTTGCCGGA TGCCGCCTAT 60
AGCCTCTACA TGGATGACAT CCTGCTGGCC TCC 93
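
Each entry in this listing declares a LENGTH that can be checked against the residues actually printed. The following is a minimal sketch of such a check, using SEQ ID NO: 10 as printed above; the helper name `listed_length` is hypothetical.

```python
import re


def listed_length(listing: str) -> int:
    """Count residues in a sequence-listing block, ignoring spaces,
    line breaks and the running position numbers at the end of each line."""
    return len(re.sub(r"[^A-Za-z]", "", listing))


SEQ_ID_NO_10 = """
TGGAAAGTGT TGCCACAGGG CGCTGAAGCC TATCGCGTGC AGTTGCCGGA TGCCGCCTAT 60
AGCCTCTACA TGGATGACAT CCTGCTGGCC TCC 93
"""

DECLARED_LENGTH = 93  # from "(A) LENGTH: 93 base pairs"
print(listed_length(SEQ_ID_NO_10) == DECLARED_LENGTH)
```

A mismatch between the counted and declared lengths would point to a transcription error in either the sequence or its header.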

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 96 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TTGGATCCAG TGYTGCCACA GGGCGCTGAA GCCTATCGCG TGCAGTTGCC GGATGCCGCC 60
TATAGCCTCT ACGTGGATGA CCTSCTGAAG CTTGAG 96
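
Several sequences in this listing contain degenerate positions written with IUPAC nucleotide codes (for example Y and S in SEQ ID NO: 11). One way to compare such a sequence with a non-degenerate one is to expand the codes into a regular expression. The sketch below assumes the standard IUPAC code table; the helper `degenerate_to_regex` is hypothetical, and only short stretches of SEQ ID NO: 10 and SEQ ID NO: 11 are used.

```python
import re

# Standard IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def degenerate_to_regex(seq):
    """Compile a degenerate nucleotide sequence into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in seq.upper().replace(" ", "")))


SEQ_ID_NO_10_START = "TGGAAAGTGTTGCCACAGGGCGCTGAAGCC"  # residues 1-30 of SEQ ID NO: 10
DEGENERATE_STRETCH = "TGYTGCCACAGGG"  # residues 11-23 of SEQ ID NO: 11

match = degenerate_to_regex(DEGENERATE_STRETCH).search(SEQ_ID_NO_10_START)
print(match.start(), match.group())
```

Here the degenerate Y is satisfied by the T found in SEQ ID NO: 10, so the stretch matches even though the two sequences are not written identically.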

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 748 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TGCAAGCTTC ACCGCTTGCT GGATGTAGGC CTCAGTACCG GNGTGCCCCG CGCGCTGTAG 60
TTCGATGTAG AAAGCGCCCG GAAACACGCG GGACCAATGC GTCGCCAGCT TGCGCGCCAG 120
CGCCTCGTTG CCATTGGCCA GCGCCACGCC GATATCACCC GCCATGGCGC CGGAGAGCGC 180
CAGCAGACCG GCGGCCAGCG GCGCATTCTC AACGCCGGGC TCGTCGAACC ATTCGGGGGC 240
GATTTCCGCA CGACCGCGAT GCTGGTTGGA GAGCCAGGCC CTGGCCAGCA ACTGGCACAG 300
GTTCAGGTAA CCCTGCTTGT CCCGCACCAA CAGCAGCAGG CGGGTCGGCT TGTCGCGCTC 360
GTCGTGATTG GTGATCCACA CGTCAGCCCC GACGATGGGC TTCACGCCCT TGCCACGCGC 420
TTCCTTGTAG ANGCGCACCA GCCCGAAGGC ATTGGCGAGA TCGGTCAGCG CCAAGGCGCC 480
CATGCCATCT TTGGCGGCAG CCTTGACGGC ATCGTCGAGA CGGACATTGC CATCGACGAC 540
GGAATATTCG GAGTGGAGAC GGAGGTGGAC GAAGCGCGGC GAATTCATCC GCGTATTGTA 600
ACGGGTGACA CCTTCCGCAA AGCATTCCGG ACGTGCCCGA TTGACCCGGA GCAACCCCGC 660
ACGGCTGCGC GGGCAGTTAT AATTTCGGCT TACGAATCAA CGGGTTACCC CAGGGCGCTG 720
AAGCCTATCG CGTGCAGTTG CCGGATGC 748



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GCATCCGGCA ACTGCACG
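
Short sequences such as SEQ ID NO: 13 can be located on a longer sequence in the listing by taking their reverse complement. The sketch below is illustrative only: the helper `reverse_complement` is hypothetical, and the listing itself does not state which sequences are used as primers on which templates.

```python
# Complement table for the four bases plus N (unknown base).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence (spaces ignored)."""
    return seq.upper().replace(" ", "").translate(COMPLEMENT)[::-1]


SEQ_ID_NO_13 = "GCATCCGGCA ACTGCACG"
SEQ_ID_NO_12_END = "AAGCCTATCG CGTGCAGTTG CCGGATGC"  # residues 721-748 of SEQ ID NO: 12

rc = reverse_complement(SEQ_ID_NO_13)
print(rc, SEQ_ID_NO_12_END.replace(" ", "").endswith(rc))
```

The reverse complement of SEQ ID NO: 13 coincides with the last 18 residues of SEQ ID NO: 12 as printed in this listing.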

10 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
15 (B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

20 

(xi) SEQUENCE. DESCRIPTION: SEQ ID NO: U 

V'* 

GTAGTTCGAT GTAGAAAGCG 



25 



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 
3 0 (B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



35 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1 



WO y»/»755 
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GCATCCGGCA ACTGCACG 

5 (2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleotide 

10 (C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

15 ( X i) SEQUENCE DESCRIPTION: SEQ ID NO: 

AGGAGTAAGG AAACCCAACG GAC 



20 (2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleotide 

25 (C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

30 (xi) SEQUENCE DESCRIPTION: SEQ ID NO 

TAAGAGTTGC ACAAGTGCG 



35 (2) INFORMATION FOR SEQ ID NO: 18: 



WO SWZ3755 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 
5 (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 

10 

TCAGGGATAG CCCCCATCTA T 
(2) INFORMATION FOR SEQ ID NO: 19: 

15 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 
20 (D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 

25 

AACCCTTTGC CACTACATCA ATTT 



(2) INFORMATION FOR SEQ ID NO: 20: 

30 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 
35 (D) TOPOLOGY: linear 



WO 98/23755 
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( i i ) MOLECULE TYPE : cDNA 

(ix) FEATURES: 

(B) LOCATION: 5 # 7, 10, 13 

(D) OTHER INFORMATION: G represents inosine (i) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



GGTCGTGCCG CAGGG 

10 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TTAGGGATAG CCCTCATCTC T

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

TCAGGGATAG CCCCCATCTA T

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

AACCCTTTGC CACTACATCA ATTT

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:



GCGTAAGGAC TCCTAGAGCT ATT

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TCATCCATGT ACCGAAGG

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ATGGGGTTCC CAAGTTCCCT 



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GCCGATATCA CCCGCCATGG 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GCATCCGGCA ACTGCACG 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

CGCGATGCTG GTTGGAGAGC 




(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TCTCCACTCC GAATATTCCG



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GATCTAGGCC ACTTCTCAGG TCCAGS



(2) INFORMATION FOR SEQ ID NO: 32: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURES:

(B) LOCATION: 6, 12, 19

(D) OTHER INFORMATION: G represents inosine (i)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

CATCTGTTTG GGCAGGCAGT AGC 23

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CTTGAGCCAG TTCTCATACC TGGA

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34 



AGTGYTRCCM CARGGCGCTG AA 
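
SEQ ID NO: 34 contains letters other than A, C, G and T. Assuming these follow the standard IUPAC nucleotide ambiguity codes (Y = C/T, R = A/G, M = A/C), the degenerate primer stands for a small pool of concrete oligonucleotides. The Python sketch below is illustrative only: the helper name `expand_degenerate` is hypothetical and is not part of the original listing.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (subset; assumed to apply here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence matched by a degenerate primer."""
    # Spaces only group the bases in blocks of ten, so they are dropped.
    choices = [IUPAC[base] for base in primer.replace(" ", "").upper()]
    return ["".join(combo) for combo in product(*choices)]


# SEQ ID NO: 34 as printed in the listing.
variants = expand_degenerate("AGTGYTRCCM CARGGCGCTG AA")
print(len(variants))  # 16 (four two-fold degenerate positions)
```

The same helper can be applied to SEQ ID NO: 35, which additionally uses S and K.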



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

GMGGCCAGCA GSAKGTCATC CA 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGATGCCGCC TATAGCCTCT AC



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

AAGCCTATCG CGTGCAGTTG CC



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TAAAGATCTA GAATTCGGCT ATAGGCGGCA TCCGGCAAGT



(2) INFORMATION FOR SEQ ID NO: 39 



(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 50 amino acids

(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 39

Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu Phe
1               5                   10                  15

Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu Thr Trp Thr Val
            20                  25                  30

Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Gln Ala Leu
        35                  40                  45

Ala Gln
    50

(2) INFORMATION FOR SEQ ID NO: 40 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 150 base pairs 
(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear



(ii) MOLECULE TYPE : cDNA 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 40 

GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 60 
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 120 
CCCCATCTAT TTGGCCAGGC ATTAGCCCAA 150
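
Read in its first frame, the 150-base cDNA of SEQ ID NO: 40 appears to encode the 50-residue peptide listed as SEQ ID NO: 39. The Python sketch below checks this under the assumption that the standard genetic code applies; the names `CODON_TABLE` and `translate` are hypothetical helpers introduced for illustration and do not come from the listing.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# SEQ ID NO: 40 as printed (spaces only group the bases in blocks of ten).
SEQ_ID_40 = (
    "GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT"
    "CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC"
    "CCCCATCTAT TTGGCCAGGC ATTAGCCCAA"
).replace(" ", "")


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, starting at the first base."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


peptide = translate(SEQ_ID_40)
print(peptide)
```

In one-letter code the result begins DAFFCIPVRPDSQFLF, which corresponds to Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu Phe in SEQ ID NO: 39.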



(2) INFORMATION FOR SEQ ID NO: 41 



(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 11 amino acids

(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 41

Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 42

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 17 amino acids

(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 42

Val Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Glu Ala
1               5                   10                  15

Leu
17

(2) INFORMATION FOR SEQ ID NO: 43 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 8 amino acids

(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 43

Leu Phe Ala Phe Glu Asp Pro Leu
1               5           8

(2) INFORMATION FOR SEQ ID NO: 44 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 8 amino acids

(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 44

Phe Ala Phe Glu Asp Pro Leu Asn
1               5           8

(2) INFORMATION FOR SEQ ID NO: 45 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 25 base pairs

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 45

GTGCTGATTG GTGTATTTAC AATCC

(2) INFORMATION FOR SEQ ID NO: 46 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 1859 base pairs

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE : cDNA 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 46 

GTGCTGATTG GTGTATTTAC AATCCTTTAT CTAATCCGAA ATGCCCATGT TGCAATATGG 60 
AAAGAAAGGG AGTTCCTAAC CTCTGGGGGA ACCCCCATTA AATACCACAA GTAAATCATG 120 
GAGTTATTGC ACACAGTGCA AAAACTCAAG GAGGTGGAAG TCTTACACTG CCAAAGCCAT 180 
CAGAAAAGGG AAGAGGGGAG AAGAGCAGCA TAAGTGGCTA CAGAGGCAAG GAAAGACTAG 240 
CAGAAAGGAA AGAGAGAAAG AGACAGAAAG TCAGAGAGAG AGAGAGGAAG AGACAGAGCA 300 
CAAAGAGGGA GTCAGAGAGA GAGAGAGACA GAGAGTCAGA GAGAAGGAAA GAGAGAGAGG 360 
AAGAGACAAA GAATGAATCA AACAGAGAGA CAGAAAGTCA GAGAGAGAGA GAGAGAGGAA 420 
GAGACAGAGA AAAAGAGGGA GTCAGAAAAA GAGAGACCAA AGAAGAAGTC CAAAGAGAAA 480 
GAAAGAGAGA TGGAAGTAGT AAAGGAAAAA CAGTGTACCC TATTCCTTTA AAAGCCGGGG 540 
TAAATTTAAA ACCTATAATT GATAACTGAA GGTCTTCTCT GTAACCCTGT AACACTCCAA 600 
TACCACCTTG TTGTCAAGTG TAAACAAGGG CGTAGCCCAA AAGCACTGAG GCCACTAACA 660 
ACCCATAGCC TTCCTATCAA AATTCCTTAA CCCAGCAGGT TTCCTAACAG GGGATCTAAA 720 
TCTTAATTAA TTACCATACA ATGGTCCAAC CAGACTTAGG AGGAATTCCC TTCAGGACGG 780 
GAAGATAGAT GCTTCCTCCC AGGCGATTAA GGGAGAAAGA CACAATGGGT ATTCAGTAAG 840 
TGCCAAGGGG AACACTTGTA GAAGCAAAGT TAGGAAAATT GCCAAATAAT TGGTTTGCTC 900 
AAGAGTTGTT TGCACTCAGC CAAACCTTGA AGTACTTGCA GAATCAGAAA GGAGCCATCT 960 
ATACCAATTC TAAGTTAATA TGGACTGAAG GAGGTTTTAT TAATACCAAA GAGAAATTAA 1020 
AATCCCAAAC TTATAAGGTT TTCAACCAAA GTAAAGTTTG CTAAAAGTTA ACAGCGTAAC 1080 
ATGTATTATC CTACTACCAC ACACTCTCAA AGGATTTCTC AGACAGTTTG CAAGAAATAA 1140 
TGATATCTAT CCTTACTCTA CAATCCCAAA TAGACTCTTT GGCAGCAGTG ACTCTCCAAA 1200 
ACCGTCAAGG CCTAGACCTC CTCACTGCTG AGAAAGGAGG ACTCTGCACC TTCTTAAGGG 1260 
AAGAGTGTTG TCTTTACACT AACCAGTCAG GGATAGTATG AGATGCTGCC CGGCATTTAC 1320 
AGAAAAAGGC TTCTGAAATC AGACAACGCC TTTCAAATTC CTATACCAAC CTCTGGAGTT 1380 
GGGCAACATG GTTTCTTCCC TTTCTATGTC CCATGGCTGC CATCTTGCTA TTACTCGCCT 1440 
TTGGGCCCTG TATTTTTAAC CTCCTTGTCA AATTTGTTTC TTCTAGGATC GAGGCCATCA 1500 
AGCTACAGAT GGTCTTACAA ATGGAACCCC AAATGAGCTC AACTATCAAC TTCTACTGAG 1560 
GACCCCTAGA CCAACCCCCT GGCCCTTTCA CTGGCCTAAA GAGTTCCCCT CTGGAGGACA 1620 
CTACCACTGC AGGGCCCCAT CTTTGCCCCT ATCCAGAAGG AAGTAGCTAG AGCAGTCATT 1680 




GCCCAATTCC CAAGAGCAGC TGGGGTGTCC CGTTTAGAGT GGGGATTGAG AGGTGAAGCC 1740 
AGCTGGACTT CTGGGTCGGG TGGGGACTTG GAGAACTTTT GTGTCTAGCT AAAGGATTGT 1800 
AAATGCAACA ATCAGTGCTC TGTGTCTAGC TAAAGGATTG TAAATACACC AATCAGCAC 1859

(2) INFORMATION FOR SEQ ID NO: 47 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH : 23 base pairs

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 47

TGATGTGAAC GGCATACTCA CTG 23

(2) INFORMATION FOR SEQ ID NO: 48

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 24 base pairs

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 48

CCCAGAGGTT AGGAACTCCC TTTC 24

(2) INFORMATION FOR SEQ ID NO: 49 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 25 base pairs 

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 49

GCTAAAGGAG ACTTGTGGTT GTCAG

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CAACATGGGC ATTTCGGATT AG

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 400 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GGCTGCTAAA GGAGACTTGT GGTTGTCAGA CAATCGCCTA CTTAGGTACC AGGCCTTATT 60 

ACTTGAGGGA CTGGTGCTTC AGATGCGCAC TTGTGCAGCT CTTAACCCAA ACTTATGCTG 120 

CCCAGAAGGA TCTTTTAGAG GTCCCCTTAG CCAACCCTGA CCTCAACCTA TATATATACT 180 

GATGGAAGTT CGTTTGTAGA AAAGGGATTA CAAAGGGNAG GATATNCCAT AGGTTAGTGA 240

TAAAGCAGTA CTTGAAAGTA AGCCTCTTCC CCCCAGGGAC CAGCGCCCCC GTTAGCAGAA 300 

CTAGTGGCAC TGACCCCGAG CCTTAGAACT TGGAAAGGGA GGAGGATAAA TGTGTATACA 360 

GATAGCAAGT ATGCTTATCT AATCCGAAAT GCCCATGTTG 400 



(2) INFORMATION FOR SEQ ID NO: 52: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2389 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 



TCAGGGATAG CCCCCATCTA TTTGGTCAGG CACTGGCCCA AGATCTAGGG ACATGCCACT 60 

TTTAAGAGCC ATTTCTCAAG TCCAGGTACT CTGGTCCTTC GGTATGTGGA TGATTTACTT 120 

TTGGCTACCA GTTCAGTAGC CTCATGCCAG CAGGCTACTC TAGATCTCTT GAACTTTCTA 180

GCTAATCAAG GGTACAAGGC ATCTAGGTTG AAGGCCCAGC TTTGCCTACA GCAGGTCAAA 240 

TATCTAGGCC TAATCTTAGC CAGAGGGACC AGGGCACTCA GCAAGGAACA AATACAGCCT 300 

ATACTGGCTT ATCCTCACCC TAAGACATTA AAACAGTTGC GGGGGTTCCT TGGAATCACT 360 

GGCTTTTTGG TGACTATGGA TTCCCAGATA CAGCAAGATT GGCAGGCCCC TCTATACTGT 420 

AATCAAGGAG ACTCACGAGG GCAAGTACTC ATCTAGTAGA ATGGGAACTA GGGACAGAAA 480

CAGCCTTCAA AACCTTAAAG CAGGCCCTAG TACAATCTCC AGCTTTAAGC CTTCCCACAG 540 



GACAAAACTT CTCTTTATAC ATCACAGAGA GGGCAGAGAT AGCTCTTGGT GTCCTTATTC 600

AGACTCATGG GACTACCCCA CAACCAGTGG CACACCTAAG TAAGGAAATT GATGTAGTAG 660

CAAAAGGCTG GCCTCACTGT TTATGGGTAG CTGTGGTGGT GGCTGTCTTA GTGTCAGAAG 720

CTATCAAAAT AATACAAGGA AAGGATCTCA CTGTCTGGAC TACTCATGAT GTAATGGCAT 780

ACTAGGTGCC AAAAGAAGTT TATGGGTATC AGACAACCAC CTGCTTAGAT ACCAGGGACT 840

ACTCCTGGAG GATTGGGCTT CAAGTGCGTT TTTTGTGGCC TCAACCCTGC CACTTTTCCT 900

CCAGAGGATG GAGAGCCGCT TGAGCATGCT TGCCAACAGG TTGTAGGCCA GAATTATTCC 960

ACCCGAGATG ATCTCTTAGA GTACCCTTAG CTAATCCTGA CCTTAACCTA TATACCAATG 1020

GAAGTTCATT TGTGGAAAAC GGGATATGAA GGGCAGGTTA TGTCATAGTT AGTGATGTAA 1080

TCATACTTGC AAGTAAGCCT CTTACCCCAG GGGCCAGCAC TCAGTTAGCA GAACTAGTCA 1140

CACTTACCTT AACCTTAGAA CTGGGAAAGG GAAAAAGAAT AAATATGTAT ACAGATAGTA 1200

AGTATGCTTA TCTAATCCTA CATGCCCATG CTGCAATATG GAAGGAAAGG GAGTTCCTAA 1260

CCCCTGGGGG AACCCCCATT AAATACCACA AGGYAAATCA TGGAGTTATT GCACGCAGTG 1320

CAAAAACTCA AGGAGGTGGC AGTCTTACAC TGCCGAAGCY ATCAAAAAGG GGAAGGAGAG 1380

GGGAGAACAG CAGCATAAGT GGTTGGCAGA GGCAGTGAAA GACCAGCAGA GAGAAGGAGA 1440

GAGACAACGT CAACGACAGA AGGAAAGAAG AGGAGGAGAC AGAGAGGAAG AGACAGAGAG 1500

ACAGTTAGTC CAAGAGAGAG ACAGAGAGAG GAAGAGACAG ACAGAAAGTC CAAGAGAGAA 1560

GGAAAGAGAG GAAGAGACCA AGGAGTCCNA GAGAGAGAAA GAGATAGAAG TAGTAAAGAA 1620

AAAACATTGT ACCCTATTCC TTTAAAAGCC GGGGTATATT TAAAACCTAT AATTGATAAT 1680

TGAGTTCTTG CACCCTCCTC CAGGGGATYG CTGGGAGGAA ACCCTCAACC GATATGTGAA 1740

AATTGTGGGT CGTCCCTATG TCTCAATTAC CAGCCAATAC CCCCTTGTTT TTAGTGTGAA 1800

CGAGGGTGTA GAGCGCAGAC AGGGAGACCT CTGACAATCC ATACCCTTCC TATCCAAAAT 1860

CCTTAACCCA GCAGGTTTTC TAAAAGGGGA TCTAAATCTT AATTAATTAC CATACAAAGG 1920

TCAAACCAGA TCTAGGAGGA ACTTCCTTCA GGACAGGATG ATAGATGGTT CCTCCCAGGC 1980

GATTAAAGAA AATAAAAAGA CACATGGGCA GCCAGTAAGT GATAAGGGAA CACTAGTAGA 2040

AGCAGTTAGG AGAAGTTGCC TAATAATTGG TCTACTCCAA ATGTGTGAGT TGTTCGCACT 2100

CAGCCCAAAT CTTAAAGTAC TTACAGAATT AGGGAGGAGC CATTTACACC AATTCTAAGT 2160

TAATATGGAC TGGATGAGGT TTTATTAATA GCGAAGGAGA ATTAAATCCT AAACTNACAA 2220

GGTTTTCAAC TAAAGTAAAT TTTACTAAAA GCTAACAGTG TAACATGCAT TATCCTACTA 2280

CAACACACTC TCANAGGATT CCTCAGACAG TTTACAAGAA ATAACAAAAT CTATCTGGTA 2340

AGGATAGTAA CTACAATCCC AAATACATTC TTTGGCAGCA GTGACTCTC 2389



(2) INFORMATION FOR SEQ ID NO: 53: 


(i) SEQUENCE CHARACTERISTICS: 




(A) LENGTH: 2448 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

TCAGGGATAG CCCCCATCTA TTTGATCAGG CACTAGCCCA AGATCTAGGC CACTTCTGAA 60
GTCCAGGCAT TCTAGTCCTT CAGTATGTGG ATGATTTACT TTTGGCTACC AGTTTGGAAG 120
CCTCATGCCA GCAGGCTACT TGAGATCTCT TGAACTTTCT AGCTAATCAA GGGTGTATGG 180
CATCTAAATT GAAAGTCCAG CTCTGCCTAC AACAAGTCAA ATATCTAGGC CTAATCTTAG 240
ATAGAAGAAC CAGGGCCCTC AGCAAGGAAT GAATAAAGCC TATGCTGGCT TATCGGCACC 300
CTAAGACATT AAAACAATTG TGGGGGTTCC TTGGAATCAC TGGCTTTTGC CGACTATGGA 360
TCCCTGGATA GAGTGAGATA GCCAGGCCCC CTCTATTACT CTTATCAAGG AGACCCAGAG 420
GGCAAATACT TATCTAGTAT TATGGGNACC AGAGGCAGAA AAAGCCTTCC AAACCTTAAA 480
GGAGACCCTA GTACAAGCTC CAGCTTTAAG CCTTCCCACA GGACAAANCT TCTCTTTATA 540
TGTCACAGAG AGAGCAGGAA TAGCTCCTGG AGTCCTTACT CAGACTTTTG GACGACCCCA 600
CGGCCAGTGG CRTACCTAAG TAAGGAAATT GATGTAGTAG CAAAAGGCTG GCCTCACTGT 660
TTATGGGTAG TTGCGGCTGT GGCAGTCTTA CTGTCAAAGG CTATCAAAAT AATACAAGGA 720
AAGGATTTCA CTATCTGGAC TACTCATGAG GAAAATGGCA TATTAGGTGC CAAAGGAAGT 780
TTTTGGCTAT CAGACAACCA CCTGCTCAGA TTCCAGGCAC TACTGATTGA GAGACCAGTG 840
CTTTAAATAT GTATGTGTGT GTGTGGCCCT CAACCCTGCC ACTGTTCTCC CAGAAGATGG 900
AGAACCAATG AAGCATTACT GTCAACAAAT TAGAGTCCAG AGTTATGCTG CCTGAGAGGA 960
TCTCTTAGAA GTCCCCTTAG CTAATCCTGA CCTTAACCTA TATGCTGATG GAAGTTCACT 1020
TGTGGAGAAT GGGATACGAA AAGCACATTA TGCCATAGTT AGTGAGGTAA CAGTACTTGA 1080
AAGTAAGCCT ATTCCCCCAT GGACCAGAGC CCAGTTAGCA GAACTAGTGG CACTTACCCA 1140
AGCCTTAGAA CTAGGAAAGG GAAAAATAAT AAATGTGTAT ACAGATAGCA AGTATGCTTA 1200
TCTAATCCTA CATGCCCATG CTGCAGTATG GAAAGAAAGG GAGTTCCTAA CCTCTGGGGG 1260
AACCCCCATT AAATACCACA AGGCAAATCA TGGAGTTATT GCATGTAGTG CAAAACCTCA 1320
AGTAGGTGGC AGTTTTACAC TGCCTGAAGC TATGGGGAAG GAGAGAGGAG AACAGCAGCA 1380
TAAGTGGCTA GCAGAGGCAG CGAAAGACTA GCAGAGAGGA GAGGTAGGGG AAAGACAGAA 1440
AGTCAAAGAA AAGAAGTCAA AGACAGACAG AGAAAGAGAC AGAGGGAGCC AGAGAGAAAG 1500
AAAAGAGAGA ACGAAAGAGA CAGAATGTCA AAGAACAGAA GAGAGAGGCA GCGCCAGAAG 1560
AGTTAAGAAA GTGAGAAAGA GAGATGGAAA TAGTAAAGAA AAAACAGTGT ACCCTATTCC 1620
TT^AAAAGCC AGGGTAAATT TAAAACGTAT AATTTTATAA TTGGAAGGTC TTCTCCATAA 1680
CCCTATAACA TTAAAATACC ACCTTGTTGT CAGTGTAAAC AAGAGCATAG CCCAAAAGCA 1740
CTGAGGCCAC TGACAACCCA TAGCCTTCCT ATCAAAAATC CTTAACTCTG CAGGTTTCCT 1800
AACAGGGGAT CTAAATCTCA ACTAATCACC ATACAATGGT CCGACCAGAC CTAGGAGCGA 1860
CTCCCCTCAG GACAGAAGGA TGGATGGTTC CTCCCAGGCC ATTAAGGGAA AGAGACACAA 1920
TGGGTATTCA GTAAGTGATA AGGGAACTCT TGTAGAAGCA GTTAGGAAGA TTGCCTAATA 1980
TTTGGTCTGC TCAAATGTGC CAGCTGTTTG CACTCAGCTA AACCTTAAAT TACTTACAGA 2040
ATTAGGAAGG AGCCATCTAT ACCAATTCTG AGTTAATATG AGCTGAACAA GTTCTTATTA 2100
ATAGCAAAGA ATCATTGAAA TCTCAAACTT GCAAAGTTTT CAACAAAAGT AAAGTTTGCT 2160
GAAAGTTAGC AGTGTAACAT GTATTATCCT AACTTCTAAT CTTGTGGAAA TCAGACCCTA 2220
TCAGTGCCCC TCAAAGCTGA AGTCCATCAG CATATGGCCA TACAACTAAT ACCCCTATTT 2280
ATAGGGTTAG GAATGGCCAC TGCTACAGGA ATGGGAGTAA CAGGTTTATC TACTTCATTA 2340
TCCTATTACC ACACACTCTT AAAGGATTTC TCAGACAGTT TACAAGAAAT AACAAAATCT 2400
ATCCTTACTC TNTARTCCCA AATAGRTTCT TTGGCAGCAG TGACTCTC 2448



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCTGAGTTCT TGCACTAACC C 21

(2) INFORMATION FOR SEQ ID NO: 55: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
GTCCGTTGGG TTTCCTTACT CCT 23 
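
As printed, SEQ ID NO: 55 reads as the reverse complement of SEQ ID NO: 16, which suggests the two oligonucleotides target opposite strands of the same site. The Python sketch below shows one way to check this; `reverse_complement` is a hypothetical helper name, and the relationship is an observation on the printed sequences rather than a statement made in the listing.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


SEQ_ID_16 = "AGGAGTAAGG AAACCCAACG GAC"
SEQ_ID_55 = "GTCCGTTGGG TTTCCTTACT CCT"

print(reverse_complement(SEQ_ID_16) == SEQ_ID_55.replace(" ", ""))  # True
```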

(2) INFORMATION FOR SEQ ID NO: 56: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1196 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 



TTCCTGAGTT CTTGCACTAA CCTCAAATGA GAGAAGTGCC GCCATAACTG CAACCCAAGA 60 

GTTTGGCGAT CCCTGGTATC TCAGTCAGGT CAATGACAGG ATGACAACAG AGGAAAGATA 120 

ATGATTCCCC ACAGGCCAGC AGGCAGTTCC CAGTGTAGAC CCTCATTAGG ACACAGAATC 180

AGAACATGGA GATTGGTGCC GCAGACATTT GCTAACTTGC GTGCTAGAAG GACTAAGGAA 240 

AACTAGGAAG ATATGAATTA TTCAATGATG TCCACTATAA CACAGGGGAA AGGAAGAAAA 300 

TCCTACTGCC TTTCTGGAGA GACTAAGGGA GGCATTGAGG AAGCATACCA GGCAAGTGGA 360 

CATTGGAGGC TCTGGAAAAG GGAAAAGTTG GGAAAAGTAT ATGTCTAATA GGGCTTGCTT 420 

CCAGTGTGGT CTACAAGGAC ACTTTAAAAA AGATTGTCCA ATAGAAATAA GCCACCACCT 480

CGTCCATGCC CCTTATGTCA AGGGAATCAC TGGAAGGCCC ACTGCCCCAG GGGATGAAGG 540 

TCCTCTGAGT CAGAAGCCAC TAACCAGATG ATCCAGCAGC AGGACTGAGG GTGCCCGGGG 600 

CAAGCGCCAG CCCATGCCAT CACCCTCACA GAGCCCCAGG TATGCTTGAC CATTGAGGGT 660 

CAGAAGGGTA CTGTCTCCTG GACACTGGCG GGCCTTCTCA GTCTTACTTT CCTGTCCTGG 720 

ACAACTGTCC TCCAGATCTG TCACTGTCCG AGGGGTCCTA GGACAGCCAG TCACTAGATA 780

CTTCTCCCAG CCACTAAGTT GTGACTGGGG AACTTTACTC TTCCACATGC TTTTCTAATT 840 




ATGCCTGAAA GCCCCACTCT CTTGTTAGGG GAGAGACATT CTAGCAAAAG CAGGGGCCAT 900

TATACATGTG AATATAGGAG AAGGAACAAC TGTTTGTTGT CCCCTGCTTG AGGAAGGAAT 960

TAATCCTGAA GTCCGGGCAA CAGAAGGACA ATATGGACAA GCAAAGAATG CCCGTCCTGT 1020

TCAAGTTAAA CTAAAGGATT CCACCTCCTT TCCCTACCAA AGGCAGTACC CCCTCAGACC 1080

CGAGACCCAA CAAGAACTCC AAAAGATTGT AAAGGACCTA AAAGCCCAAG GCCTAGTAAA 1140

ACCAAGCAAT AGCCCTTGCA AGACTCCAAT TTTAGGAGTA AGGAAACCCA ACGGAC 1196



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2391 base pairs 

( B ) TYPE : nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
ATGATCCAGC AGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC CATCACCCTC 60
ACAGAGCCCC AGGTATGCTT GACCATTGAG GGTCAGAAGG GTNACTGTCT CCTGGACACT 120
GGCGGNGCCT TCTCAGTCTT ACTTTCCTGT CCTGGACAAC TGTCCTCCAG ATCTGTCACT 180
GTCCGAGGGG TCCTAGGACA GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC 240
TGGGGAACTT TACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300
TTGGGGAGAG ACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT AGGAGAAGGA 360
ACAACTGTTT GTTGTCCCCT GCTTGAGGAA GGAATTAATC CTGAAGTCCG GGCAACAGAA 420
GGACAATATG GACAAGCAAA GAATGCCCGT CCTGTTCAAG TTAAACTAAA GGATTCCACC 480
TCCTTTCCCT ACCAAAGGCA GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG 540
ATTGTAAAGG ACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600
CCAATTTTAG GAGTAAGGAA ACCCAACGGA CAGTGGAGGT TAGTGCAAGA ACTCAGGATT 660
ATCAATGAGG CTGTTGTTCC TCTATACCCA GCTGTACCTA ACCCTTATAC AGTGCTTTCC 720
CAAATACCAG AGGAAGCAGA GTGGTTTACA GTCCTGGACC TTAAGGATGC CTTTTTCTGC 780
ATCCCTGTAC GTCCTGACTC TCAATTCTTG TTTGCCTTTG AAGATCCTTT GAACCCAACG 840
TCTCAACTCA CCTGGACTGT TTTACCCCAA GGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900
CAGGCATTAG CCCAAGACTT GAGTCAATTC TCATACCTGG ACACTCTTGT CCTTCAGTAC 960
ATGGATGATT TACTTTTAGT CGCCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGAA 1020
CTCTTAACTT TCCTCACTAC CTGTGGCTAC AAGGTTTCCA AACCAAAGGC TCGGCTCTGC 1080 
TCACAGGAGA TTAGATACTN AGGGCTAAAA TTATCCAAAG GCACCAGGGC CCTCAGTGAG 1140 
GAACGTATCC AGCCTATACT GGCTTATCCT CATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200 
TTCCTTGGCA TAACAGGTTT CTGCCGAAAA CAGATTCCCA GGTACASCCC AATAGCCAGA 1260
CCATTATATA CACTAATTAN GGAAACTCAG AAAGCCAATA CCTATTTAGT AAGATGGACA 1320 
CCTACAGAAG TGGCTTTCCA GGCCCTAAAG AAGGCCCTAA CCCAAGCCCC AGTGTTCAGC 1380 
TTGCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA AAACAGGAAT AGCTCTAGGA 1440 
GTCCTTACGC AGGTCTCAGG GATGAGCTTG CAACCCGTGG TATACCTGAG TAAGGAAATT 1500 
GATGTAGTGG CAAAGGGTTG GCCTCATNGT TTATGGGTAA TGGNGGCAGT AGCAGTCTNA 1560
GTATCTGAAG CAGTTAAAAT AATACAGGGA AGAGATCTTN CTGTGTGGAC ATCTCATGAT 1620 
GTGAACGGCA TACTCACTGC TAAAGGAGAC TTGTGGTTGT CAGACAACCA TTTACTTAAN 1680 
TATCAGGCTC TATTACTTGA AGAGCCAGTG CTGNGACTGC GCACTTGTGC AACTCTTAAA 1740 
CCCAAACTTA TGCTGCCCAG AAGGATCTTT NTAGAGGTCC CCTTAGCCAA CCCTGACCTC 1800 
AACTATATAT ATACTGATGG AAGTTCGTTT GTAGAAAAGG GATTACAAAG GGNAGGATAT 1860
NCCATAGGTG TTAGTGATAA AGCAGTACTT GAAAGTAAGC CTCTTCCCCC CCAGGGACCA 1920 
GCGCCCCCGT TAGCAGAACT AGTGGCACTG ACCCCGCGAG CCTTAGAACT TTGGAAAGGG 1980 
AGGAGGATAA ATGTGTATAC AGATAGCAAG TATGCTTATC TAATCCGAAA TGCCCATGTT 2040 
GTTTATCTAA TCCGAAATGC CCATGTTGCA ATATGGAAAG AAAGGGAGTT CCTAACCTCT 2100 
GGGGGAACCC CCATTAAATA CCACAAGTTA ATCATGGAGT TATTGCACAC AGTGCAAAAA 2160
CTCAAGGAGG TGGgAGTCTT ACACTGCCAA AGCCATCAGA AAAGGGAAAG GGGAGAAGAG 2220 
CAGCATAAGT GGCTACAGAG GCAAGGAAAG ACTAGCAGAA AGGAAAGAGA GAAAGAGACA 2280 
GAAAGTCAGA GAGAGAGAGA GGAAGAGACA GAGCACAAAG AGGGAGTCAG AGAGAGAGAG 2340 
AGACAGAGAG TCAGAGAGAA GGAAAGAGAG AGAGGAAGAG ACAAAGAATG A 2391 




(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1722 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

TGGAGAATAG CAGCATAAGT TGGCTGGCAG AAGTAGGGAA AGACAGCAAG AAGTAAAGAA 60 

AAAAARGAGA AAGTCAGAGA AAGAAAAAAA GAGAGGAAGA AACAAAGAAG AACTTGAAGA 120 

GAGAAAGAAG TAGTAAAGAA AAAACAGTAT ACCCTATTCC TTTAAAAGCC AGGGTAAATT 180

TCTGTCTACC TAGCCAAGGC ATATTCTTCT TATGTGGAAC ATCAACCTAT ATCTGCCTCC 240

CCACTAACTG GACAGGCACC TGAACCTTAG TCTTTCTAAG TCCCAACATT AACATTGCCC 300 

CAGGAAATCA GACCCTATTG GTACCTGTCA AAGCTAAAGT CCCGTCAGTG CAGAGCCATA 360 

CAACTAATAT CCCTATTTAT AGGGTTAGGA ATGGCTACTG CTACAGGAAC TGGAATAGCC 420 

GGTTTATCTA CTTCATTATC CTACTACCAT ACACTCTCAA AGAATTTCTC AGACAGTTTG 480

CAAGAAATAA TGAAATCTAT TCTTACTTTA CAATCCCAAT TAGACTCTTT GGCAGCAATG 540 

ACTCTCCAAA ACCGCCGAGG CCCACACCTC CTCACTGCTG AGAAAGGAGG ACTCTGCACC 600 

TTCTTAGGGG AAGAGTGTTG TTTTTACACT AACCAGTCAG GGATAGTACG AGATGCCACC 660 

TGGCATTTAC AGGAAAGGGC TTCTGATATC AGACAATGCC TTTCAAACTC TTATACCAAC 720 

CTCTGGAGTT GGGCAACATG GCTTCTTCCA TTTCTAGGTC CCATGGCAGC CATCTTGCTG 780

TTACTCACCT TTGGGCCCTG TATTTTTAAG CTTCTTGTCA AATTTGTTTC CTCTAGGATC 840 

GAAGCCATCA AGCTACAGAT GGTCTTACAA ATGGAACCCC AAATGAGTTC AACTAACAAC 900 

TTCTACCAAG GACCCCTGGA ACGATCCACT GGCACTTCCA CTAGCCTAGA GATTCCCCTC 960 

TGGAAGACAC TACAACTGCA GGGCCCCTTC TTTGCCCCTA TCCAGCAGGA AGTAGCTAGA 1020

GCGGTCATCG GCCAAATTCC CAACAGCAGT TGGGGTGTCC TGTTTAGAGG GGGGATTGAA 1080

GAGGTGACAG CCTGCTGGCA GCCTCACAGC CCTCGTTGGY TCTCAGTGCC TCCTCAGCCT 1140 
TGGTGCCCJVC TCTGGCCGTG CTTGAGGAGC CCTTCAGCCT GCCACTGCAC TGTGGGAGCC 1200

TCTTTCTGGG CTGGACAAGG CCGGAGCCAG CTCCCTCAGC TTGCAGGGAG GTATGGAGGG 1260 

AGAGATGCAG GCGGGAACCA GGGCTGCGCA TGGCGCTTGC GGGCCAGCAT GAGTTCCAGG 1320 

TGGGCGTGGG CTCGGCGGGC CCCACACTCG GGCAGTGAGG GGCTTAGCAC CTGGGCCAGA 1380

CAGATGCTGT GCTCAACTTC TTCGCTGGGC CTTAGCTGCC TTCCCCGTGG GGCAGGGCTY 1440 

CGGGAACMTG CAGCCTGCCC ATGCTTGAGC CCCCCACCCC GCCGTGGGTT CYTGCACAGC 1500 

CCAAGCTTCC CGGACAAGCA CCACCCCTTA TCCACGGTGC CCAGTCCCAT CAACCACCCA 1560 

AGGGTTGAGG AGTGCGGGCA CACAGCGCGG GATTGGCAGG CAGTTCCACT TGCGGCCTTG 1620 

GTGCGGGATC CACTGCGTGA AGCCAGCTGG GCTCCTGAGT CTGGTGGGGA CTTGGAGAAT 1680

CTTTATGTCT AGCTAAGGGA TTGTAAATAC ACCAATCAGC AC 1722 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 495 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60
GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120
GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180
ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240
GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300
CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360
TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420
CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480
TGCCGCAGAC ATTTA 495

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2503 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

CCAAGAACCC ACCAATTCCG GANCACATTT TGGCGACCAC GAAGGGACTT TCGCATATCG 60 

CCAAGCGGTG AGACAATAGC CGAGCGGTGA GACCTTTCCC AATCGCCAAG CAGTGAGTAC 120 

CATCAGACCC CTTTCACTTG CTATTCTGTC CTATCTTTCT TTAGAATTCG GGGGCTAAAT 180

ACCGGGCATC TGTCAGCCAT TTAAAAGTGA CTAGCGGGCC GCCGGACTAA AGACACGGGT 240 



GTCAAGCTTT CTGGGAAAGG GCTCTCTAAC AACCCCCAAC TCTTTGGAGT TGGGACCGTT 300

GGTTTGCCTA GAACCAGCTT CCGCTTTTCC TGTACTTCTG GGCTGAGCCG TGGGTTGACA 360

GTGAAGGAAA GCCATGCATC TCCGGGGTCT CGMCAACATG TTGGTTGACC CTGCGGCCAT 420

GAGTGGAACT CTCAAAAGCA TGTCGCCCAA GCGACACTCG CCTATCTATC CTATCTATCC 480

TGACCCTTGC CCTCTGGGTC CTAATGCCTG CCAGACAAAC TTCCTCTCGC CTCTCTTCTC 540

TGAAGCTAGA ACCGCTTCTA AAAATTGCTA CCTGGTCTCT GGTGCTTTTC CTARTTTCTC 600

CTATAAAGAA TGAWTTCTAG TATTAAACTC CAGGACTCTG TTACCTTCTT TAGGCACCCG 660

GGCTCACCAA TCAGAAAGAC ACAGTTTTTG CCCAAGGCCC CATCGTAGTG GGGACTACCT 720

GGAATTTTAG GATCCCTCCT CAGACTAACA GGCCTAACAA AAGTTATTCC TGAAGCTAGG 780

ATATGGGGAG CCTCAGAAAT TGTATCCCTC CTATTCATAT AAGTGAGAAC AAAAGGTGTC 840

ACTCTTCCAA CCCTGAAGAT CCCCTCCCTC CCTCAGGGTA TGGCCCTCCA TTTCATTTTT 900

GTGGCATAAC ATCTTTATAG GATGGGGTAA AGTCCCAATA CTAACAGGAG AATGCTTAGG 960

ACTCTAACAG GTTTTTGAGA ATGCGTCAGT AAGGGCCACT AAATCTGATT TTTCTCAGTC 1020

GGTCCTCCTT GTGGTCTAGG AGGACAGGCA AGGTTGTGCA GGTTTTCGAG AATGCGTCAG 1080

TAAGGACCAC TAAATCCGAC CTTCCTCGGT CCTCCATGTG GTCTGGGAGG AAAACTAGTG 1140

TTTCTGCTGC TGCGTCGGTG AGCGCAACTA TTCAAGTCAG CAGGGTCCAG GGACCGTTGC 1200

AGGTTCTTGG GCAGGGGTTG TTTCTGCTGC TGCATTGGTG AATGCAACTA TTCTGATCAG 1260

CAGGGTCCCA GGACCATTGC AGGTCCTTGG GCAGGGAGAG AAACAAAACA AACCAAAACT 1320

GTGGGCGGTT TTGTCTTTCA TATGGGAAAC ACTCAGGCAT CAACAGGTTC ACCCTTGAAA 1380

TGCATCCTAA GCCATTGGGA CCAATTTGAC CCACAAACCC TGAAAAAGAG GAGGCTCATT 1440

TTTTCCTGCA CTACGGCTTG GCCCCAATAT TCTCTTTYTG ATGGGGAAAA ATGGCCACCT 1500

GAGGGAAGCA CAAATTACAA TAYTATCCTA CAGCYTGATC TTTTCTGTAA GAGGGAAGGC 1560

AAATGGAGTG AATACCTTAT GTCCAAGCTT TCTTTTCATT GAGGGAGAAT ACACAACTAT 1620

GCAAAGCTTG CAATTTACAT CCCACAGGAG GACCCTTCAG CTTACCCCCA TATCCTAGCC 1680

TCCCTATAGC TTCCCTTCCT ATTGATGATA CTCCTCCTCT AATCTCCCCT GCCCAGAAGG 1740

AAATAAGCAA AGAAATCTCC AAAGGTCCAC AAAAACCCCC GGGCTATCGG TTATGTCCCT 1800

TCAAGYTGTA GGGGGAGGGG AATTTGGCCC AACCCGGGTG CATGTCCCTT CTCCCTCTCT 1860

GATTTAAAGC AGATCAAGGC AGACCTGGGG AAGTTTTCAG ATGATCCTGA TAGGTACATA 1920

GATGTCCTAC AGGGTCTAGG GCAAACCTTT GACCTCACTT GGAGAGACGT CATGCTACTG 1980

TTAGATCAAA CCCTGGCCTT TAATGAAAAG AATGCGGCTT TAGCTGCAGC CTGAGAGTTT 2040

GGAGATACCT GGTATCCTAG TCAAGTAAAT GAAAGAATGA CAGCCGAAGA AAGGGACAAC 2100

TTCCTTACTG GTCAGCAACC CATCCCCAGT ATGGATCCCC ACTGGGACTT TGACTCAGAT 2160

CATGGGGACT GGAGTCGTAA ACATCTGTTG ATCTGTGTTC TGGAAGGACT AAGGAGAATT 2220

GGGAAAAAGC CCATGAATTA TTCAATGATA TCCACCATAA CCCAGGGAAA GGAAGAAAAT 2280

CCTTCTGCCT TCCTCGAGCG GCTACAAGAG GCCTTAAGAA AATATACTCC CCTGTCACCC 2340

GAATCACTCG AGGGTCAATT GATTCTAAAA GATAAGTTTA TTACCCAATC AGCCACAGAT 2400

ATCAGGAGAA AGCTCCAAAA GCAAGCCCTG AGCCTGAACA AAATCTAGAG ACATTATTAA 2460

ACCTGGCAAC CTTGGTGTTC TATAATAGGG ACCAAGAGGA ACA 2503



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1167 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

AAGGAAACTC AGAAAGCCAA TACCCATTTA GTAAGATGGA CACCAGAAGC AGAAGCAGCT 60
TTCCAGGCCC TAAAGAAATC CCTAACCCAA GCCCCAGTGT TAAGCTTGCC AACGGGGCAA 120
GACTTTTCTT TATATGTCAC AGAAAAACAG GAATAGCTCT AGGAGTCCTT ACACAGGTCC 180
AAGGGACAAG CTTGCAACCT GTGGCATACC TGAGTAAGGA AACTGATGTA NTGGCAAAGG 240
GTTGGCCTCA TTGTTTACAG GTAGGGCAGC AGTAGCAGTC TTAGTTTCTG AAACAGTTAA 300
AATAATACAG GGAAGAGATC TTACTGTGTG GACATCTCAT GATGTGAACG GCATACTCAC 360
TGCTAAAGAG GACTTGTGGC TGTCAGACAA CCATTTACTT AAATAGCAGG TTCTATTACT 420
TGAAGTGCCA GTGCTGCGAC TGCACATTTG TGCAACTCTT AACCCAGCCA CATTTCTTCC 480
AGACAATGAA GAAAAGATAG AACATAACTG TCAACAAGTA ATTGCTCAAA CCTATGCTGC 540
TCGAGGGGAC CTTCTAGAGG TTCCCTTGAC TGATCCCGAC CTCAACTTGT ATACTGATGG 600
AAGTTCCTTG GCAGAAAAAG GACTTTGAAA AGCGGGGTAT GCAGTGATCA GTGATAATGG 660
AATACTTGAA AGTAATCGCC TCACTCCAGG AACTAGTGCT CACCTGGCAG AACTAATAGC 720
CCTCACTTGG GCACTAGAAT TAGGAGAAGG AAAAAGGGTA AATATATATT CAGACTCTAA 780
GTATGCTTAC CTAGTCCTCC ATGCCCATGC AGCAATATGG AGAGAGAGGG AATTCCTAAC 840
TTCTGAGGGA ACACCTATCA ACCATCAGGG AAGCCATTAG GAGATTATTA TTGGCTGTAC 900
AGAAACCTAA AGAGGTGGCA GTCTTACACT GCCAGGGTCA TCAGGAAGAA GAGGAAAGGG 960
AAATAGAAGG CAATCGCCAA GCGGATATTG AAGCAAAAAA AGCCGCAAGG CAGGACTCTC 1020
CATTAGAAAT GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC 1080
CCCAGTACTC AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT 1140
CCAGATGGCT AGCCACTGAG GAAGGAA 1167




(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 78 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

TCCAAAGGCA CCAGGGCCCT CAGTGAGGAA CGTATCCAGC CTATACTGGC TTATCCTCAT 60
CCCAAAACCC TAAAGCAA 78 



(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Ser Lys Gly Thr Arg Ala Leu Ser Glu Glu Arg Ile Gln Pro Ile Leu
1 5 10 15

Ala Tyr Pro His Pro Lys Thr Leu Lys Gln
20 25
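
The 78-base cDNA of SEQ ID NO: 62 and the 26-residue peptide of SEQ ID NO: 63 can be compared by translating the nucleotide sequence in its first reading frame. The following is a minimal sketch, assuming the standard genetic code and a frame starting at base 1. The names `CODON_TABLE`, `translate`, `SEQ_ID_62` and `SEQ_ID_63` are illustrative and are not part of the listing, and the correspondence between the two records is an observation drawn from the listed sequences rather than a statement made by the listing itself.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons with ambiguity codes become 'X'."""
    dna = "".join(dna.split()).upper()  # drop the blanks between 10-base groups
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


# SEQ ID NO: 62 as listed (assumed to be read from its first base).
SEQ_ID_62 = (
    "TCCAAAGGCA CCAGGGCCCT CAGTGAGGAA CGTATCCAGC CTATACTGGC TTATCCTCAT "
    "CCCAAAACCC TAAAGCAA"
)
# SEQ ID NO: 63 written in one-letter code.
SEQ_ID_63 = "SKGTRALSEERIQPILAYPHPKTLKQ"

if __name__ == "__main__":
    print(translate(SEQ_ID_62) == SEQ_ID_63)
```

The same helper can be pointed at any other nucleotide record of the listing, provided the reading frame is chosen by the user.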



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

AAATGTCTGC GGCACCAATC TCCATGTT 28

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

AAGGGGCATG GACGAGGTGG TGGCTTATTT 30 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

GGAGAAGAGC AGCATAAGTG G 21



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

GTGCTGATTG GTGTATTTAC AATCC 25



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 34



(2) INFORMATION FOR SEQ ID NO: 69: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

GCCATCAAGC CACCCAAGAA CTCTTAACTT 30



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

CCAATAGCCA GACCATTATA TACACTAATT 30



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

GCCATAACTG CAACCCAAGA GTT 23


(2) INFORMATION FOR SEQ ID NO: 72: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

GGACGAGGTG GTGGCTTATT TCT 23

(2) INFORMATION FOR SEQ ID NO: 73: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

AACTTGCGTG CTAGAAGGAC TAAGG 25

(2) INFORMATION FOR SEQ ID NO: 74: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

AACTTTTCCC TTTTCCAGAT CCTC 24

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

GCATACCAGG CAAGTGGACA TT 22



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GAGGCTCTGG AAAAGGGAAA AGTT 24

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

CTGTCCGTTG GGTTTCCTTA CTCCT 25

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(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

AGGAGTAAGG AAACCCAACG GACAG 25
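
Several of the short single-stranded records in this listing read as reverse complements of one another; for example, SEQ ID NO: 79 matches the reverse complement of SEQ ID NO: 76. The sketch below shows how such a relationship can be checked. It assumes unambiguous A/C/G/T bases only, and the names `reverse_complement`, `SEQ_ID_76` and `SEQ_ID_79` are illustrative rather than taken from the listing.

```python
# Complement map for unambiguous DNA bases only (no IUPAC ambiguity codes).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring blanks."""
    seq = "".join(seq.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing, with their 10-base grouping kept.
SEQ_ID_76 = "CTGTCCGTTG GGTTTCCTTA CTCCT"
SEQ_ID_79 = "AGGAGTAAGG AAACCCAACG GACAG"

if __name__ == "__main__":
    # Compare after removing the grouping blanks from SEQ ID NO: 79.
    print(reverse_complement(SEQ_ID_76) == SEQ_ID_79.replace(" ", ""))
```

Whether two such records were actually used as a sense/antisense pair is not stated here; the check only compares the listed strings.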

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

TGTATATAAT GGTCTGGCTA TTGGG 25

(2) INFORMATION FOR SEQ ID NO: 81: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

AGGAGTAAGG AAACCCAACG GACAG 25

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

TTCGGCAGAA ACCTGTTATG CCAAGG 26



(2) INFORMATION FOR SEQ ID NO: 83: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

CTCGATTTCT TGCTGGGCCT TA 22

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

GTTGATTCCC TCCTCAAGCA 20 



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

CTCTACCAAT CAGCATGTGG 20 



(2) INFORMATION FOR SEQ ID NO: 86:



WO 98/23755 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
TGTTCCTCTT GGTCCCTAT 19 



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 433 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Met Ala Thr Ala Thr Gly Thr Gly Ile Ala Gly Leu Ser Thr Ser Leu
1 5 10 15

Ser Tyr Tyr His Thr Leu Ser Lys Asn Phe Ser Asp Ser Leu Gln Glu
20 25 30

Ile Met Lys Ser Ile Leu Thr Leu Gln Ser Gln Leu Asp Ser Leu Ala
35 40 45

Ala Met Thr Leu Gln Asn Arg Arg Gly Pro His Leu Leu Thr Ala Glu
50 55 60

Lys Gly Gly Leu Cys Thr Phe Leu Gly Glu Glu Cys Cys Phe Tyr Thr
65 70 75 80

Asn Gln Ser Gly Ile Val Arg Asp Ala Thr Trp His Leu Gln Glu Arg
85 90 95

Ala Ser Asp Ile Arg Gln Cys Leu Ser Asn Ser Tyr Thr Asn Leu Trp
100 105 110

Ser Trp Ala Thr Trp Leu Leu Pro Phe Leu Gly Pro Met Ala Ala Ile
115 120 125

Leu Leu Leu Leu Thr Phe Gly Pro Cys Ile Phe Lys Leu Leu Val Lys
130 135 140

Phe Val Ser Ser Arg Ile Glu Ala Ile Lys Leu Gln Met Val Leu Gln
145 150 155 160

Met Glu Pro Gln Met Ser Ser Thr Asn Asn Phe Tyr Gln Gly Pro Leu
165 170 175

Glu Arg Ser Thr Gly Thr Ser Thr Ser Leu Glu Ile Pro Leu Trp Lys
180 185 190

Thr Leu Gln Leu Gln Gly Pro Phe Phe Ala Pro Ile Gln Gln Glu Val
195 200 205

Ala Arg Ala Val Ile Gly Gln Ile Pro Asn Ser Ser Trp Gly Val Leu
210 215 220

Phe Arg Gly Gly Ile Glu Glu Val Thr Ala Cys Trp Gln Pro His Ser
225 230 235 240

Pro Arg Trp Xaa Ser Val Pro Pro Gln Pro Trp Cys Pro Leu Trp Pro
245 250 255

Cys Leu Arg Ser Pro Ser Ala Cys His Cys Thr Val Gly Ala Ser Phe
260 265 270

Trp Ala Gly Gln Gly Arg Ser Gln Leu Pro Gln Leu Ala Gly Arg Tyr
275 280 285

Gly Gly Arg Asp Ala Gly Gly Asn Gln Gly Cys Ala Trp Arg Leu Arg
290 295 300

Ala Ser Met Ser Ser Arg Trp Ala Trp Ala Arg Arg Ala Pro His Ser
305 310 315 320

Gly Ser Glu Gly Leu Ser Thr Trp Ala Arg Gln Met Leu Cys Ser Thr
325 330 335

Ser Ser Leu Gly Leu Ser Cys Leu Pro Arg Gly Ala Gly Leu Arg Glu
340 345 350

Xaa Ala Ala Cys Pro Cys Leu Ser Pro Pro Pro Arg Arg Gly Phe Leu
355 360 365

His Ser Pro Ser Phe Pro Asp Lys His His Pro Leu Ser Thr Val Pro
370 375 380

Ser Pro Ile Asn His Pro Arg Val Glu Glu Cys Gly His Thr Ala Arg
385 390 395 400

— ^ _ - it- "» r>—~ t A «« 7i l a Ma T.on Val Am Ann Pro Leu Ara
405 410 415

Glu Ala Ser Trp Ala Pro Glu Ser Gly Gly Asp Leu Glu Asn Leu Tyr
420 425 430

Val
433



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 693 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60
GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120
GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180
ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240
GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300
CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360
TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420
CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480
TGCCGCAGAC ATTTACTAAC TTGCGTGCTA GAAGGACTAA GGAAAACTAG GAAGACTATG 540
AATTATTCAA TGATGTCCAC TATAACACAG GGGAAAGGAA GAAAATCCTA CTGCCTTTCT 600
GGAGAGACTA AGGGAGGCAT TGAGGAAGCA TACCAGGCAA GTGGACATTG GAGGCTCTGG 660
AAAAGGGAAA AGTTGGGCAA ATTGAATGCC TAA 693



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1577 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

AACTTGCGTG CTAGAAGGAC TAAGGAAAAC TAGGAAGACT ATGAATTATT CAATGATGTC 60
CACTATAACA CAGGGGAAAG GAAGAAAATC CTACTGCCTT TCTGGAGAGA CTAAGGGAGG 120
CATTGAGGAA GCATACCAGG CAAGTGGACA TTGGAGGCTC TGGAAAAGGG AAAAGTTGGG 180
CAAATTGAAT GCCTAATAGG GCTTGCTTCC AGTGCAGTCT ACAAGGACGC TTTAGAAAAG 240
ATTGTCCAAG TAGAAATAAG CCGCCCCTCG TCCATGCCCC TTATGTCAAG GGAATCACTG 300
GAAGGCCTAC TGCCCCAGGG GACGAAGGTC CTCTGAGTCA GAAGCCACTA ACCTGATGAT 360
CCAGCAGCAG GACTGAGGGT GCCCGGGGCA AGTGCCAGCC CATGCCATCA CCCTCAGAGC 420
CCCGGGTATG TTTGACCATT GAGAGCCAGG AAGTTAACTG TCTCCTGGAC ACTGGCGCAG 480
CCTTCTCAGT CTTACTTTCC TGTCCCAGAC AATTGTCCTC CAGATCTGTC ACTATCCGAG 540
GGGTCCTAAG ACAGCCAGTC ACTACATACT TCTCTCAGCC ACTAAGTTGT GACTGGGGAA 600
CTTTACTCTT TTCACATGCT TTTCTAATTA TGCCTGAAAG CCCCACTCCC TTGTTAGGGA 660
GAGACATTTT AGCAAAAGCA GGGGCCATTA TACACCTGAA CATAGGAAAA GGAATACCCA 720
TTTGCTGTCC CCTGCTTGAG GAAGGAATTA ATCCTGAAGT CTGGGCAATA GAAGGACAAT 780
ATGGACAAGC AAAGAATGCC CGTCCTGTTC AAGTTAAACT AAAGGATTCT GCCTCCTTTC 840
CCTACCAAAG GAAGTACCCT CTTAGACCCG AGGCCCTACA AGGACTCAAA AGATTGTTAA 900
GGACCTAAAA GCCCAAGGCC TAGTAAAACC ATGCAGTAGC CCCTGCAATA CTCCAATTTT 960
AGGAGTAAGG AAACCCAACG GACAGTGGAG GTTAGTGCAA GATCTCAGGA TTATTAATGA 1020
GGCTGTTTTT CCTCTATACC CAGCTGTATC TAGCCCTTAT ACTCTGCTTT CCCTAATACC 1080
AGAGGAAGCA GAGTAGTTTA CAGTCCTGGA CCTTAAGGAT GCCTCTTTCT GCATCCCTGT 1140
ACATCCTGAT TCTCAATTCT TGTTTGTCTT TGAAGATCCT TTGAACCCAA TGTCTCAATT 1200
CACCTGGACT GTTTTACCCC AGGGGTTCCG GGATAGCCCC CATCTATTTG GCCAGGCATT 1260
AGCCCAAGAC TTGAGCCAAT TCTCATACCT GGACATCTTG TCCTTCGGTA TGGGATGATT 1320
TAATTTTAGC CACCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGCG TTCTTAAATT 1380
CZTfZTaGCTAC AAGGTTTCCA AACCAAAGGC TCAGCTCTGC TCACAGCAGG 1440
TTAAATACTT AGGGTTAAAA TTATCCAAAG GCACCAGGGC CCTCTGTGAG GAATGTATCC 1500
AACCTGTACT GGCTTATCTT CATCCCAAAA CCCTAAAGCA ACTAAGAAGG TCCTTGGCAT 1560
AACAGGTTTC TGCCGAA 1577

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 182 amino acids

(B) TYPE: amino acid

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Ser Ser Ser Arg Thr Glu Gly Ala Arg Gly Lys Cys Gln Pro Met Pro
1 5 10 15

Ser Pro Ser Glu Pro Arg Val Cys Leu Thr Ile Glu Ser Gln Glu Val
20 25 30

Asn Cys Leu Leu Asp Thr Gly Ala Ala Phe Ser Val Leu Leu Ser Cys
35 40 45

Pro Arg Gln Leu Ser Ser Arg Ser Val Thr Ile Arg Gly Val Leu Arg
50 55 60

Gln Pro Val Thr Thr Tyr Phe Ser Gln Pro Leu Ser Cys Asp Trp Gly
65 70 75 80

Thr Leu Leu Phe Ser His Ala Phe Leu Ile Met Pro Glu Ser Pro Thr
85 90 95

Pro Leu Leu Gly Arg Asp Ile Leu Ala Lys Ala Gly Ala Ile Ile His
100 105 110

Leu Asn Ile Gly Lys Gly Ile Pro Ile Cys Cys Pro Leu Leu Glu Glu
115 120 125

Gly Ile Asn Pro Glu Val Trp Ala Ile Glu Gly Gln Tyr Gly Gln Ala
130 135 140

Lys Asn Ala Arg Pro Val Gln Val Lys Leu Lys Asp Ser Ala Ser Phe
145 150 155 160

Pro Tyr Gln Arg Lys Tyr Pro Leu Arg Pro Glu Ala Leu Gln Gly Leu
165 170 175

Lys Arg Leu Leu Arg Thr

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(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC 36



(2) INFORMATION FOR SEQ ID NO: 92: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

AGATCTGCAG AATTCGATAT CA 22



(2) INFORMATION FOR SEQ ID NO: 93: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2304 base pairs

(B) TYPE: nucleotide

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

TCCAGCAGCA GGACTGAGGG TGCCCGGGGC AAGTGCCAGC CCATGCCATC 50
ACCCTCAGAG CCCCGGGTAT GTTTGACCAT TGAGAGCCAG GAAGTTAACT 100
GTCTCCTGGA CACTGGCGCA GCCTTCTCAG TCTTACTTTC CTGTCCCAGA 150
CAATTGTCCT CCAGATCTGT CACTATCCGA GGGGTCCTAG GACAGCCAGT 200
CACTACATAC TTCTCTCAGC CACTAAGTTG TGACTGGGGA ACTTTACTCT 250
TTTCACATGC TTTTCTAATT ATGCCTGAAA GCCCCACTCC CTTGTTAGGG 300
AGAGACATTT TAGCAAAAGC AGGGGCCATT ATACACCTGA ACATAGGAAA 350
AGGAATACCC ATTTGCTGTC CCCTGCTTGA GGAAGGAATT AATCCTGAAG 400
TCTGGGCAAT AGAAGGACAA TATGGACAAG CAAAGAATGC CCGTCCTGTT 450
CAAGTTAAAC TAAAGGATTC TGCCTCCTTT CCCTACCAAA GGAAGTACCC 500
TCTTAGACCC GAGGCCCTAC AAGGANCTCA AAAGATTGTT AAGGACCTAA 550
AAGCCCAAGG CCTAGTAAAA CCATGCAGTA GCCCCTGCAA TACTCCAATT 600
TTAGGAGTAA GGAAACCCAA CGGACAGTGG AGGTTAGTGC AAGATCTCAG 650
GATTATTAAT GAGGCTGTTT TTCCTCTATA CCCAGCTGTA TCTAGCCCTT 700
ATACTCTGCT TTCCCTAATA CCAGAGGAAG CAGAGTGGTT TACAGTCCTG 750
GACCTTAAGG ATGCCTTTTT CTGCATCCCT GTACGTCCTG ACTCTCAATT 800
CTTGTTTGCC TTTGAAGATC CTTTGAACCC AACGTCTCAA CTCACCTGGA 850
CTGTTTTACC CCAAGGGTTC AGGGATAGCC CCCATCTATT TGGCCAGGCA 900
TTAGCCCAAG ACTTGAGTCA ATTCTCATAC CTGGACACTC TTGTCCTTCA 950
GTACGTGGAT GATTTACTTT TAGTCGCCCG TTCAGAAACC TTGTGCCATC 1000
AAGCCACCCA AGAACTCTTA ACTTTCCTCA CTACCTGTGG CTACAAGGTT 1050
TCCAAACCAA AGGCTCGGCT CTGCTCACAG GAGATTAGAT ACTTAGGGCT 1100
AAAATTATCC AAAGGCACCA GGGCCCTCAG TGAGGAACGT ATCCAGCCTA 1150
TACTGGCTTA TCCTCATCCC AAAACCCTAA AGCAACTAAG AGGGTTCCTT 1200
GGCATAACAG GTTTCTGCCG AAAACAGATT CCCAGGTACA CCCCAATAGC 1250
CAGACCATTA TATACACTAA TTAGGGAAAC TCAGAAAGCC AATACCTATT 1300
TAGTAAGATG GACACCTACA GAAGTGGCTT TCCAGGCCCT AAAGAAGGCC 1350
CTAACCCAAG CCCCAGTGTT CAGCTTGCCA ACAGGGCAAG ATTTTTCTTT 1400
ATATGCCACA GAAAAAACAG GAATAGCTCT AGGAGTCCTT ACGCAGGTCT 1450
CAGGGATGAG CTTGCAACCC GTGGTATACC TGAGTAAGGA AATTGATGTA 1500
GTGGCAAAGG GTTGGCCTCA TTGTTTATGG GTAATGGCGG CAGTAGCAGT 1550
CTTAGTATCT GAAGCAGTTA AAATAATACA GGGAAGAGAT CTTACTGTGT 1600
'""^CATACTCA CTGCTAAAGG AGACTTGTGG 1650
TTGTCAGACA ACCATTTACT TAATTATCAG GCTCTATTAC TTGAAGAGCC 1700
AGTGCTGAGA CTGCGCACTT GTGCAACTCT TAAACCCGCC ACATTTCTTC 1750
CAGACAATGA AGAAAAGATA GAACATAACT GTCAACAAGT AATTGCTCAA 1800
ACCTATGCTG CTCGAGGGGA CCTTCTAGAG GTTCCCTTGA CTGATCCCGA 1850
CCTCAACTTG TATACTGATG GAAGTTCCTT GGCAGAAAAA GGACTTCGAA 1900
AAGCGGGGTA TGCAGTGATC AGTGATAATG GAATACTTGA AAGTAATCGC 1950
CTCACTCCAG GAACTAGTGC TCACCTGGCA GAACTAATAG CCCTCACTTG 2000
GGCACTAGAA TTAGGAGAAG GAAAAAGGGT AAATATATAT TCAGACTCTA 2050
AGTATGCTTA CCTAGTCCTC CATGCCCATG CAGCAATATG GAGAGAGAGG 2100
GAATTCCTAA CTTCTGAGGG AACACCTATC AACCATCAGG AAGCCATTAG 2150
GAGATTATTA TTGGCTGTAC AGAAACCTAA AGAGGTGGCA GTCTTACACT 2200
GCCAGGGTCA TCAGGAAGAA GAGGAAAGGG AAATAGAAGG CAATCGCCAA 2250
GCGGATATTG AAGCAAAAAA AGCCGCAAGG CAGGACTCTC CATTAGAAAT 2300
GCTT 2304
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(2) INFORMATION FOR SEQ ID NO: 94: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2364 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

ATGATCCAGC AGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC 50
CATCACCCTC ACAGAGCCCC AGGTATGCTT GACCATTGAG GGTCAGAAGG 100
GTNACTGTCT CCTGGACACT GGCGGNGCCT TCTCAGTCTT ACTTTCCTGT 150
CCTGGACAAC TGTCCTCCAG ATCTGTCACT GTCCGAGGGG TCCTAGGACA 200
GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC TGGGGAACTT 250
TACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300
TTGGGGAGAG ACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT 350
AGGAGAAGGA ACAACTGTTT GTTGTCCCCT GCTTGAGGAA GGAATTAATC 400
CTGAAGTCCG GGCAACAGAA GGACAATATG GACAAGCAAA GAATGCCCGT 450
CCTGTTCAAG TTAAACTAAA GGATTCCACC TCCTTTCCCT ACCAAAGGCA 500
GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG ATTGTAAAGG 550
ACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600
CCAATTTTAG GAGTAAGGAA 650
ACTCAGGATT ATCAATGAGG CTGTTGTTCC TCTATACCCA GCTGTACCTA 700
ACCCTTATAC AGTGCTTTCC CAAATACCAG AGGAAGCAGA GTGGTTTACA 750
GTCCTGGACC TTAAGGATGC CTTTTTCTGC ATCCCTGTAC GTCCTGACTC 800
TCAATTCTTG TTTGCCTTTG AAGATCCTTT GAACCCAACG TCTCAACTCA 850
CCTGGACTGT TTTACCCCAA GGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900
CAGGCATTAG CCCAAGACTT GAGTCAATTC TCATACCTGG ACACTCTTGT 950
CCTTCAGTAC ATGGATGATT TACTTTTAGT CGCCCGTTCA GAAACCTTGT 1000
GCCATCAAGC CACCCAAGAA CTCTTAACTT TCCTCACTAC CTGTGGCTAC 1050
AAGGTTTCCA AACCAAAGGC TCGGCTCTGC TCACAGGAGA TTAGATACTN 1100
AGGGCTAAAA TTATCCAAAG GCACCAGGGC CCTCAGTGAG GAACGTATCC 1150
AGCCTATACT GGCTTATCCT CATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200
TTCCTTGGCA TAACAGGTTT CTGCCGAAAA CAGATTCCCA GGTACASCCC 1250
AATAGCCAGA CCATTATATA CACTAATTAN GGAAACTCAG AAAGCCAATA 1300
CCTATTTAGT AAGATGGACA CCTACAGAAG TGGCTTTCCA GGCCCTAAAG 1350
AAGGCCCTAA CCCAAGCCCC AGTGTTCAGC TTGCCAACAG GGCAAGATTT 1400
TTCTTTATAT GCCACAGAAA AAACAGGAAT AGCTCTAGGA GTCCTTACGC 1450
AGGTCTCAGG GATGAGCTTG CAACCCGTGG TATACCTGAG TAAGGAAATT 1500
GATGTAGTGG CAAAGGGTTG GCCTCATNGT TTATGGGTAA TGGNGGCAGT 1550
AGCAGTCTNA GTATCTGAAG CAGTTAAAAT AATACAGGGA AGAGATCTTN 1600
CTGTGTGGAC ATCTCATGAT GTGAACGGCA TACTSRCTGC TAAAGGAGAC 1650
TTGTGGTTGT CAGACAACCA TTTACTTAAN TAYCAGGCYY TATTACTTGA 1700
AGAGCCAGTG CTGNGACTGC GCACTTGTCC AACTCTTAAA CCCAAACTTA 1750
TGCTGCCCAG AAGGATCTTT NTAGAGGTCC CCTTAGCCAA CCCTGACCTC 1800
AACTATATAT ATACTGATGG AAGTTCGTTT GTAGAAAAGG GATTACAAAG 1850
GGNAGGATAT NCCATAGGTG TTAGTGATAA AGCAGTACTT GAAAGTAAGC 1900
CTCTTCCCCC CCAGGGACCA GCGCCCCCGT TAGCAGAACT AGTGGCACTG 1950
ACCCCGCGAG CCTTAGAACT TTGGAAAGGG AGGAGGATAA ATGTGTATAC 2000
AGATAGCAAG TATGCTTATC TAATCCGAAA TGCCCATGTT GCAATATGGA 2050
AAGAAAGGGA GTTCCTAACC TCTGGGGGAA CCCCCATTAA ATACCACAAG 2100
TTAATCATGG AGTTATTGCA CACAGTGCAA AAACTCAAGG AGGTGGAAGT 2150
CTTACACTGC CAAAGCCATC AGAAAAGGGA AAGAGGGGAA GAGCAGCATA 2200
AGTGGCTACA GAGGCAAGGA AAGACTAGCA GAAAGGAAAG AGAGAAAGAG 2250
«. **-»Ls-ims-^ * r> r» i\ r» is. r* i\ rt nGAGGAAGAG ACAGAGCACA AAGAGGGAGT 2300
CAGAGAGAGA GAGAGACAGA GAGTCAGAGA GAAGGAAAGA GAGAGAGGAA 2350
GAGACAAAGA ATGAH 2365
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(2) INFORMATION FOR SEQ ID NO: 95: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 768 amino acids

(B) TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

SSSRTEGARG KCQPMPSPSE PRVCLTIESQ EVNCLLDTGA AFSVLLSCPR 50
QLSSRSVTIR GVLGQPVTTY FSQPLSCDWG TLLFSHAFLI MPESPTPLLG 100
RDILAKAGAI IHLNIGKGIP ICCPLLEEGI NPEVWAIEGQ YGQAKNARPV 150
QVKLKDSASF PYQRKYPLRP EALQGXQKIV KDLKAQGLVK PCSSPCNTPI 200
LGVRKPNGQW RLVQDLRIIN EAVFPLYPAV SSPYTLLSLI PEEAEWFTVL 250
DLKDAFFCIP VRPDSQFLFA FEDPLNPTSQ LTWTVLPQGF RDSPHLFGQA 300
LAQDLSQFSY LDTLVLQYVD DLLLVARSET LCHQATQELL TFLTTCGYKV 350
SKPKARLCSQ EIRYLGLKLS KGTRALSEER IQPILAYPHP KTLKQLRGFL 400
GITGFCRKQI PRYTPIARPL YTLIRETQKA NTYLVRWTPT EVAFQALKKA 450
LTQAPVFSLP TGQDFSLYAT EKTGIALGVL TQVSGMSLQP VVYLSKEIDV 500
VAKGWPHCLW VMAAVAVLVS EAVKIIQGRD LTVWTSHDVN GILTAKGDLW 550
LSDNHLLNYQ ALLLEEPVLR LRTCATLKPA TFLPDNEEKI EHNCQQVIAQ 600
TYAARGDLLE VPLTDPDLNL YTDGSSLAEK GLRKAGYAVI SDNGILESNR 650
LTPGTSAHLA ELIALTWALE LGEGKRVNIY SDSKYAYLVL HAHAAIWRER 700
EFLTSEGTPI NHQEAIRRLL LAVQKPKEVA VLHCQGHQEE EEREIEGNRQ 750
ADIEAKKAAR QDSPLEML 768

(2) INFORMATION FOR SEQ ID NO: 96:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 114 amino acids 

(B) TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 


SSSRTEGARG KCQPMPSPSE PRVCLTIESQ EVNCLLDTGA AFSVLLSCPR 
QLSSRSVTIR GVLGQPVTTY FSQPLSCDWG TLLFSHAFLI MPESPTPLLG 
RDILAKAGAI IHLN 






(2) INFORMATION FOR SEQ ID NO: 97: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: amino acids

(B) TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

IGKGIPICCPLLEEGINPEVWAIEGQYGQAKNARPV
QVKLKDSASFPYQRKYPLRPEALQGXQKIVKDLKAQGLVKPCSSPCNTPI
LGVRKPNGQWRLVQDLRIINEAVFPLYPAVSSPYTLLSLIPEEAEWFTVL
DLKDAFFCIPVRPDSQFLFAFEDPLNPTSQLTWTVLPQGFRDSPHLFGQA
LAQDLSQFSYLDTLVLQYVDDLLLVARSETLCHQATQELLTFLTTCGYKV
SKPKARLCSQEIRYLGLKLSKGTRALSEERIQPILAYPHPKTLKQLRGFL
GITGFCRKQIPRYTPIARPLYTLIRETQKANTYLVRWTPTEVAFQALKKA
LTQAPVFSLPTGQDFSLYATEKTGIALGVLTQVSGMSLQPVVYLSKEIDV
VAKGWPHCLWVMAAVAVLVSEAVKIIQGRDLTVWTSHDVNGILTAKGDLW
LSDNHLLNYQALLLEEPVLRLRTCATLKPATFLPDNEEKIEHNCQQVIAQ
TYAARGDLLEVPLTDPDLNLYTDGSSLAEKGLRKAGYAVISDNGILESNR
LTPGTSAHLAELIALTWALELGEGKRVNIYSDSKYAYLVLHAHAAIWRER
EFLTSEGTPINHQEAIRRLLLAVQKPKEVAVLHCQGHQEEEEREIEGNRQ
ADIEAKKAARQDSPLEML

(2) INFORMATION FOR SEQ ID NO: 98: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: amino acids 

(B) TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

LYTDGSSLAEKGLRKAGYAVISDNGILESNR
LTPGTSAHLAELIALTWALELGEGKRVNIYSDSKYAYLVLHAHAAIWRER
EFLTSEGTPINHQEAIRRLLLAVQKPKEVAVLHCQGHQEEEEREIEGNRQ
ADIEAKKAARQDSPLEML

(2) INFORMATION FOR SEQ ID NO: 99:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
AGGAGTAAGG AAACCCAACG GAC 23

(2) INFORMATION FOR SEQ ID NO: 100:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
TAAGAGTTGC ACAAGTGCG 19

(2) INFORMATION FOR SEQ ID NO: 101:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
TCAGGGATAG CCCCCATCTA T 21

(2) INFORMATION FOR SEQ ID NO: 102:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
AACCCTTTGC CACTACATCA ATTT 24

(2) INFORMATION FOR SEQ ID NO: 103:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
AGCAGCAGGA CTGAGGGT 18

(2) INFORMATION FOR SEQ ID NO: 104:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 105:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
GACAGCAAAT GGGTATTCCT TTCC 24

(2) INFORMATION FOR SEQ ID NO: 106:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
AGGAGTAAGG AAACCCAACG GACA 24

(2) INFORMATION FOR SEQ ID NO: 107:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:
TGTATATAAT GGTCTGGCTA TTGGG 25

(2) INFORMATION FOR SEQ ID NO: 108:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
TTCGGCAGAA ACCTGTTATG CCAAGG 26

(2) INFORMATION FOR SEQ ID NO: 109:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
GGCTCTGCTC ACAGGAGATT AGATAC 26

(2) INFORMATION FOR SEQ ID NO: 110:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
AAAGGCACCA GGGCCCTCAG TGAGGA 26

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
GGTTTAAGAG TTGCACAAGT GCGCAGTC 28

(2) INFORMATION FOR SEQ ID NO: 112: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 310 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC CCCAGTACTC 60 

AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT CCAGATGGCT 120 

AGCCACTGAG GAAGGAAAAA TACTTTCACC TGCAGCTAAC CAACAGAAAT TACTTAAAAC 180 

CCTTCACCAA ACCTTCCACT TAGGCATTGA TAGCACCCAT CAGATGGCCA AATTATTATT 240 

TACTGGACCA GGCCTTTTCA AAACTATCAA GAAGATAGTC AGGGGCTGTG AAGTGTGCCA 300 

AAGAAATAAT 310 



(2) INFORMATION FOR SEQ ID NO: 113: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

Leu Ile Glu Gly Pro Leu Val Trp Gly Asn Pro Leu Trp Glu Thr Lys
1 5 10 15

Pro Gln Tyr Ser Ala Gly Lys Ile Glu Xaa Glu Thr Ser Gln Gly His
20 25 30

Thr Phe Leu Pro Ser Arg Trp Leu Ala Thr Glu Glu Gly Lys Ile Leu
35 40 45

Ser Pro Ala Ala Asn Gln Gln Lys Leu Leu Lys Thr Leu His Gln Thr
50 55 60

Phe His Leu Gly Ile Asp Ser Thr His Gln Met Ala Lys Leu Leu Phe
65 70 75 80

Thr Gly Pro Gly Leu Phe Lys Thr Ile Lys Lys Ile Val Arg Gly Cys
85 90 95

Glu Val Cys Gln Arg Asn Asn
100
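
The residues printed for SEQ ID NO: 113 appear to correspond to SEQ ID NO: 112 read from its second nucleotide, with the in-frame TAG codon shown as Xaa. The following is a minimal Python sketch of that check, not part of the listing: the reading-frame offset and the use of the standard genetic code are assumptions, and `translate` is a hypothetical helper written for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (assumption: the listing uses the standard code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO: 112 as printed above (310 nucleotides); spaces are removed below.
SEQ_112 = (
    "GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC CCCAGTACTC"
    "AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT CCAGATGGCT"
    "AGCCACTGAG GAAGGAAAAA TACTTTCACC TGCAGCTAAC CAACAGAAAT TACTTAAAAC"
    "CCTTCACCAA ACCTTCCACT TAGGCATTGA TAGCACCCAT CAGATGGCCA AATTATTATT"
    "TACTGGACCA GGCCTTTTCA AAACTATCAA GAAGATAGTC AGGGGCTGTG AAGTGTGCCA"
    "AAGAAATAAT"
).replace(" ", "")


def translate(dna, offset=0):
    """Translate complete codons of dna starting at offset; '*' marks a stop."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Assumed reading frame: start at the second nucleotide (offset 1).
peptide = translate(SEQ_112, offset=1)
print(len(SEQ_112), len(peptide))
print(peptide)
```

In this sketch the single `*` in the output falls at the position printed as Xaa in SEQ ID NO: 113.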



(2) INFORMATION FOR SEQ ID NO: 114:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 635 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:



CCCTGTATCT TTAACCTCCT TGTTAAGTTT GTCTCTTCCA GAATCAAAAC TGTAAAACTA 60
CAAATTGTTC TTCAAATGGA GCACCAGATG GAGTCCATGA CTAAGATCCA CCGTGGACCC 120
CTGGACCGGC CTGCTAGCCC ATGCTCCGAT GTTAATGACA TTGAAGGCAC CCCTCCCGAG 180
GAAATCTCAA CTGCACAACC CCTACTATGC CCCAATTCAG CGGGAAGCAG TTAGAGCGGT 240
CATCAGCCAA CCTCCCCAAC AGCACTTGGG TTTTCCTGTT GAGAGGGGGG ACTGAGAGAC 300
AGGACTAGCT GGATTTCCTA GGCCAACGAA GAATCCCTAA GCCTAGCTGG GAAGGTGACT 360
GCATCCACCT CTAAACATGG GGCTTGCAAC TTAGCTCACA CCCGACCAAT CAGAGAGCTC 420
ACTAAAATGC TAATTAGGCA AAAATAGGAG GTAAAGAAAT AGCCAATCAT CTATTGCCTG 480
AGAGCACAGC GGGAGGGACA AGGATCGGGA TATAAACCCA GGCATTCGAG CCGGCAACGG 540
CAACCCCCTT TGGGTCCCCT CCCTTTGTAT GGGCGCTCTG TTTTCACTCT ATTTCACTCT 600
ATTAAATCTT GCAACTGAAA AAAAAAAAAA AAAAA 635

(2) INFORMATION FOR SEQ ID NO: 115:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 77 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

Pro Cys Ile Phe Asn Leu Leu Val Lys Phe Val Ser Ser Arg Ile Lys
1 5 10 15

Thr Val Lys Leu Gln Ile Val Leu Gln Met Glu His Gln Met Glu Ser
20 25 30

Met Thr Lys Ile His Arg Gly Pro Leu Asp Arg Pro Ala Ser Pro Cys
35 40 45

Ser Asp Val Asn Asp Ile Glu Gly Thr Pro Pro Glu Glu Ile Ser Thr
50 55 60

Ala Gln Pro Leu Leu Cys Pro Asn Ser Ala Gly Ser Ser
65 70 75

(2) INFORMATION FOR SEQ ID NO: 116: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 
TGGGGTTCCA TTTGTAAGAC CATCTGTAGC TT 



(2) INFORMATION FOR SEQ ID NO: 117: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1481 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 
ATGGCCCTCC CTTATCATAC TTTTCTCTTT ACTGTTCTCT TACCCCCTTT CGCTCTCACT 60 
GCACCCCCTC CATGCTGCTG TACAACCAGT AGCTCCCCTT ACCAAGAGTT TCTATGAAGA 120 
ACGCGGCTTC CTGGAAATAT TGATGCCCCA TCATATAGGA GTTTATCTAA GGGAAACTCC 180 
ACCTTCACTG CCCACACCCA TATGCCCCGC AACTGCTATA ACTCTGCCAC TCTTTGCATG 240 
CATGCAAATA CTCATTATTG GACAGGGAAA ATGATTAATC CTAGTTGTCC TGGAGGACTT 300 
GGAGCCACTG TCTGTTGGAC TTACTTCACC CATACCAGTA TGTCTGATGG GGGTGGAATT 360 
CAAGGTCAGG CAAGAGAAAA ACAAGTAAAG GAAGCAATCT CCCAACTGAC CCGGGGACAT 420 
AGCACCCCTA GCCCCTACAA AGGACTAGTT CTCTCAAAAC TACATGAAAC CCTCCGTACC 480 
CATACTCGCC TGGTGAGCCT ATTTAATACC ACCCTCACTC GGCTCCATGA GGTCTCAGCC 540 



WO 98/23755 

CAAAACCCTA CTAACTGTTG GATGTGCCTC CCCCTGCACT [illegible] [illegible] 600
CCTGTTCCTG AACAATGGAA CAACTTCAGC ACAGAAATAA ACACCACTTC CGTTTTAGTA 660
GGACCTCTTG TTTCCAATCT GGAAATAACC CATACCTCAA ACCTCACCTG TGTAAAATTT 720
AGCAATACTA TAGACACAAC CAGCTCCCAA TGCATCAGGT GGGTAACACC TCCCACACGA 780
ATAGTCTGCC TACCCTCAGG AATATTTTTT GTCTGTGGTA CCTCAGCCTA TCATTGTTTG 840
AATGGCTCTT CAGAATCTAT GTGCTTCCTC TCATTCTTAG TGCCCCCTAT GACCATCTAC 900
ACTGAACAAG ATTTATACAA TCATGTCGTA CCTAAGCCGC ACAACAAAAG AGTACCCATT 960
CTTCCTTTTG TTATCAGAGC AGGAGTGCTA GGCAGACTAG GTACTGGCAT TGGCAGTATC 1020
ACAACCTCTA CTCAGTTCTA CTACAAACTA TCTCAAGAAA TAAATGGTGA CATGGAACAG 1080
GTCACTGACT CCCTGGTCAC CTTGCAAGAT CAACTTAACT CCCTAGCAGC AGTAGTCCTT 1140
CAAAATCGAA GAGCTTTAGA CTTGCTAACC GCCAAAAGAG GGGGAACCTG TTTATTTTTA 1200
GGAGAAGAAC GCTGTTATTA TGTTAATCAA TCCAGAATTG TCACTGAGAA AGTTAAAGAA 1260
ATTCGAGATC GAATACAATG TAGAGCAGAG GAGCTTCAAA ACACCGAACG CTGGGGCCTC 1320
CTCAGCCAAT GGATGCCCTG GGTTCTCCCC TTCTTAGGAC CTCTAGCAGC TCTAATATTG 1380
TTACTCCTCT TTGGACCCTG TATCTTTAAC CTCCTTGTTA AGTTTGTCTC TTCCAGAATT 1440
GAAGCTGTAA AGCTACAGAT GGTCTTACAA ATGGAACCCC A 1481



(2) INFORMATION FOR SEQ ID NO: 118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 493 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Met Ala Leu Pro Tyr His Thr Phe Leu Phe Thr Val Leu Leu Pro Pro
1 5 10 15

Phe Ala Leu Thr Ala Pro Pro Pro Cys Cys Cys Thr Thr Ser Ser Ser
20 25 30

Pro Tyr Gln Glu Phe Leu Xaa Arg Thr Arg Leu Pro Gly Asn Ile Asp
35 40 45

Ala Pro Ser Tyr Arg Ser Leu Ser Lys Gly Asn Ser Thr Phe Thr Ala
50 55 60

His Thr His Met Pro Arg Asn Cys Tyr Asn Ser Ala Thr Leu Cys Met
65 70 75 80

His Ala Asn Thr His Tyr Trp Thr Gly Lys Met Ile Asn Pro Ser Cys
85 90 95

Pro Gly Gly Leu Gly Ala Thr Val Cys Trp Thr Tyr Phe Thr His Thr
100 105 110

Ser Met Ser Asp Gly Gly Gly Ile Gln Gly Gln Ala Arg Glu Lys Gln
115 120 125

Val Lys Glu Ala Ile Ser Gln Leu Thr Arg Gly His Ser Thr Pro Ser
130 135 140

Pro Tyr Lys Gly Leu Val Leu Ser Lys Leu His Glu Thr Leu Arg Thr
145 150 155 160

His Thr Arg Leu Val Ser Leu Phe Asn Thr Thr Leu Thr Arg Leu His
165 170 175

Glu Val Ser Ala Gln Asn Pro Thr Asn Cys Trp Met Cys Leu Pro Leu
180 185 190

His Phe Arg Pro Tyr Ile Ser Ile Pro Val Pro Glu Gln Trp Asn Asn
195 200 205

Phe Ser Thr Glu Ile Asn Thr Thr Ser Val Leu Val Gly Pro Leu Val
210 215 220

Ser Asn Leu Glu Ile Thr His Thr Ser Asn Leu Thr Cys Val Lys Phe
225 230 235 240

Ser Asn Thr Ile Asp Thr Thr Ser Ser Gln Cys Ile Arg Trp Val Thr
245 250 255

Pro Pro Thr Arg Ile Val Cys Leu Pro Ser Gly Ile Phe Phe Val Cys
260 265 270

Gly Thr Ser Ala Tyr His Cys Leu Asn Gly Ser Ser Glu Ser Met Cys
275 280 285

Phe Leu Ser Phe Leu Val Pro Pro Met Thr Ile Tyr Thr Glu Gln Asp
290 295 300

Leu Tyr Asn His Val Val Pro Lys Pro His Asn Lys Arg Val Pro Ile
305 310 315 320

Leu Pro Phe Val Ile Arg Ala Gly Val Leu Gly Arg Leu Gly Thr Gly
325 330 335

Ile Gly Ser Ile Thr Thr Ser Thr Gln Phe Tyr Tyr Lys Leu Ser Gln
340 345 350

Glu Ile Asn Gly Asp Met Glu Gln Val Thr Asp Ser Leu Val Thr Leu
355 360 365

Gln Asp Gln Leu Asn Ser Leu Ala Ala Val Val Leu Gln Asn Arg Arg
370 375 380

Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr Cys Leu Phe Leu
385 390 395 400

Gly Glu Glu Arg Cys Tyr Tyr Val Asn Gln Ser Arg Ile Val Thr Glu
405 410 415

Lys Val Lys Glu Ile Arg Asp Arg Ile Gln Cys Arg Ala Glu Glu Leu
420 425 430

Gln Asn Thr Glu Arg Trp Gly Leu Leu Ser Gln Trp Met Pro Trp Val
435 440 445

Leu Pro Phe Leu Gly Pro Leu Ala Ala Leu Ile Leu Leu Leu Leu Phe
450 455 460

Gly Pro Cys Ile Phe Asn Leu Leu Val Lys Phe Val Ser Ser Arg Ile
465 470 475 480

Glu Ala Val Lys Leu Gln Met Val Leu Gln Met Glu Pro
485 490



(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CG 32

(2) INFORMATION FOR SEQ ID NO: 120:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1329 base pairs
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CGCCAAAAGA GGGGGAACCT GTTTATTTTT 60
AGGGGAAGAA TGCTGTTAGT ATGTTAATCA ATCTGGAATC ATTACTGAGA AAGTTAAAGA 120
AATTTGAGAT [missing] [missing] GGACCTTCAA AACACTGCAC CCTGGGGCCT 180
CCTCAGCCAA TGGATGCCCT GGACTCTCCC CTTCTTAGGA CCTCTAGCAG CTATAATATT 240
TTTACTCCTC TTTGGACCCT GTATCTTCAA CTTCCTTGTT AAGTTTGTCT CTTCCAGAAT 300
TGAAGCTGTA AAGCTACAAA TAGTTCTTCA AATGGAACCC CAGATGCAGT CCATGACTAA 360
AATCTACCGT GGACCCCTGG ACCGGCCTGC TAGACTATGC TCTGATGTTA ATGACATTGA 420
AGTCACCCCT CCCGAGGAAA TCTCAACTGC ACAACCCCTA CTACACTCCA ATTCAGTAGG 480
AAGCAGTTAG AGCAGTTGTC AGCCAACCTC CCCAACAGTA CTTGGGTTTT CCTGTTGAGA 540
GGGTGGACTG AGAGACAGGA CTAGCTGGAT TTCCTAGGCT GACTAAGAAT CCCNAAGCCT 600
ANCTGGGAAG GTGACCGCAT CCATCTTTAA ACATGGGGCT TGCAACTTAG CTCACACCCG 660
ACCAATCAGA GAGCTCACTA AAATGCTAAT CAGGCAAAAA CAGGAGGTAA AGCAATAGCC 720
AATCATCTAT TGCCTGAGAG CACAGCGGGA AGGACAAGGA TTGGGATATA AACTCAGGCA 780
TTCAAGCCAG CAACAGCAAC CCCCTTTGGG TCCCCTCCCA TTGTATGGGA GCTCTGTTTT 840
CACTCTATTT CACTCTATTA AATCATGCAA CTGCACTCTT CTGGTCCGTG TTTTTTATGG 900
CTCAAGCTGA GCTTTTGTTC GCCATCCACC ACTGCTGTTT GCCACCGTCA CAGACCCGCT 960
GCTGACTTCC ATCCCTTTGG ATCCAGCAGA GTGTCCACTG TGCTCCTGAT CCAGCGAGGT 1020
ACCCATTGCC ACTCCCGATC AGGCTAAAGG CTTGCCATTG TTCCTGCATG GCTAAGTGCC 1080
TGGGTTTGTC CTAATAGAAC TGAACACTGG TCACTGGGTT CCATGGTTCT CTTCCATGAC 1140
CCACGGCTTC TAATAGAGCT ATAACACTCA CCGCATGGCC CAAGATTCCA TTCCTTGGTA 1200
TCTGTGAGGC CAAGAACCCC AGGTCAGAGA ANGTGAGGCT TGCCACCATT TGGGAAGTGG 1260
CCCACTGCCA TTTTGGTAGC GGCCCACCAC CATCTTGGGA GCTGTGGGAG CAAGGATCCC 1320
CCAGTAACA 1329



(2) INFORMATION FOR SEQ ID NO: 121:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 162 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Gln Asn Arg Arg Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr
1 5 10 15

Cys Leu Phe Leu Gly Glu Glu Cys Cys Xaa Tyr Val Asn Gln Ser Gly
20 25 30

Ile Ile Thr Glu Lys Val Lys Glu Ile Xaa Asp Arg Ile Xaa Cys Arg
35 40 45

[row for residues 49 to 64 missing in source]
50 55 60

Met Pro Trp Thr Leu Pro Phe Leu Gly Pro Leu Ala Ala Ile Ile Phe
65 70 75 80

Leu Leu Leu Phe Gly Pro Cys Ile Phe Asn Phe Leu Val Lys Phe Val
85 90 95

Ser Ser Arg Ile Glu Ala Val Lys Leu Gln Ile Val Leu Gln Met Glu
100 105 110

Pro Gln Met Gln Ser Met Thr Lys Ile Tyr Arg Gly Pro Leu Asp Arg
115 120 125

Pro Ala Arg Leu Cys Ser Asp Val Asp Asp Ile Glu Val Thr Pro Pro
130 135 140

Glu Glu Ile Ser Thr Ala Gln Pro Leu Leu His Ser Asn Ser Val Gly
145 150 155 160

Ser Ser

(2) INFORMATION FOR SEQ ID NO: 122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
GGCATTGATA GCACCCATCA G 21

(2) INFORMATION FOR SEQ ID NO: 123:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
CATGTCACCA GGGTGGAATA G 21

(2) INFORMATION FOR SEQ ID NO: 124:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 758 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:
GGCATTGATA GCACCCATCA GATGGCCAAA TCATTATTTA CTGGACCAGG CCTTTTCAAA 60
ACTATCAAGC AGATAGGGCC CGTGAAGCAT GCCAAAGAAA TAATCCCCTG CCTTATCGCC 120
ATGTTCCTTC AGGAGAACAA AGAACAGGCC ATTACCCAGG GGAAGACTGG CAACTAGATT 180 
TTACCCACAT GGCCAAATGT CAGGGATTTC AGCATCTACT AGTCTGGGCA GATACTTTCA 240 
CTGGTTGGGT GGAGTCTTCT CCTTGTAGGA CAGAAAAGAC CCAAGAGGTA ATAAAGGCAC 300 
TAATGAAATA ATTCCCAGAT TTGGACTTCC CCCAGGATTA CAGGGTGACA ATGGCCCCGC 360 
TTTCAAGGCT GCAGTAACCC AGGGAGTATC CCAGGTGTTA GGCATACAAT ATCACTTACA 420
CTGTGCCTGG AGGCCACAAT CCTCCAGAAA AGTCAAGAAA ATGAATGAAA CACTCAAAGA 480 
TCTAAAAAAG CTAACCCAAG AAACCCACAT TGCATGACCT GTTCTGTTGC CTATAACCTT 540 
ACTAAGAATC CATAACTATC CCCCAAAAAG CAGGACTTAG CCCATACGAG ATGCTATATG 600 
GATGGCCTTT CCTAACCAAT GACCTTGTGC TTGACTGAGA AATGGCCAAC TTAGTTGCAG 660 
ACATCACCTC CTTAGCCAAA TATCAACAAG TTCTTAAAAC ATCACAGGGA ACCTGTCCCC 720
GAGAGGAGGG AAAGGAACTA TTCCACCCTG GTGACATG 758 
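
SEQ ID NO: 122 matches the first 21 nucleotides of SEQ ID NO: 124, and SEQ ID NO: 123 matches the reverse complement of its last 21 nucleotides, which is what one would expect if the two short sequences were used as a primer pair for this fragment. That use is an assumption here, not something stated in this part of the listing. The sketch below checks the two matches; `reverse_complement` is a hypothetical helper, and only the first and last printed rows of SEQ ID NO: 124 are needed.

```python
def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string (A, C, G, T only)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Sequences copied from the listing above; spaces are removed for comparison.
SEQ_122 = "GGCATTGATA GCACCCATCA G".replace(" ", "")
SEQ_123 = "CATGTCACCA GGGTGGAATA G".replace(" ", "")

# First printed row (positions 1-60) and last printed row (positions 721-758)
# of SEQ ID NO: 124.
SEQ_124_FIRST_ROW = (
    "GGCATTGATA GCACCCATCA GATGGCCAAA TCATTATTTA CTGGACCAGG CCTTTTCAAA"
).replace(" ", "")
SEQ_124_LAST_ROW = "GAGAGGAGGG AAAGGAACTA TTCCACCCTG GTGACATG".replace(" ", "")

# Does SEQ 124 begin with SEQ 122, and end with the reverse complement of SEQ 123?
starts_with_122 = SEQ_124_FIRST_ROW.startswith(SEQ_122)
ends_with_rc_123 = SEQ_124_LAST_ROW.endswith(reverse_complement(SEQ_123))
print(starts_with_122, ends_with_rc_123)
```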



(2) INFORMATION FOR SEQ ID NO: 126:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
CGGACATCCA AAGTGATGGG AAACG 



(2) INFORMATION FOR SEQ ID NO: 127:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

GGACAGGAAA GTAAGACTGA GAAGGC 26 



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

CCTAGAACGT ATTCTGGAGA ATTGGG 

(2) INFORMATION FOR SEQ ID NO: 129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

TGGCTCTCAA TGGTCAAACA TACCCG 



(2) INFORMATION FOR SEQ ID NO: 130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1511 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

CCTAGAACGT ATTCTGGAGA ATTGGGACCA ATGTGACACT CAGACGCTAA GAAAGAAACG 60 
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GCTTCCTGAG GGAAGTATAA ATTATAACAT CATCTTACAG CTAGACCTCT TCTGTAGAAA 180 

GGAGGGCAAA TGGAGTGAAG TGCCATATGT GCAAACTTTC TTTTCATTAA GAGACAACTC 240 

ACAATTATGT AAAAAGTGTG GTTTATGCCC TACAGGAAGC CCTCAGAGTC CACCTCCCTA 300 

CCCCAGCGTC CCCTCCCCGA CTCCTTCCTC AACTAATAAG GACCCCCCTT TAACCCAAAC 360

GGTCCAAAAG GAGATAGACA AAGGGGTAAA CAATGAACCA AAGAGTGCCA ATATTCCCCG 420 

ATTATGCCCC CTCCAAGCAG TGAGAGGAGG AGAATTCGGC CCAGCCAGAG TGCCTGTACC 480 

TTTTTCTCTC TCAGACTTAA AGCAAATTAA AATAGACCTA GGTAAATTCT CAGATAACCC 540 

TGACGGCTAT ATTGATGTTT TACAAGGGTT AGGACAATCC TTTGATCTGA CATGGAGAGA 600 

TATAATGTTA CTACTAAATC AGACACTAAC CCCAAATGAG AGAAGTGCCG CTGTAACTGC 660

AGCCCGAGAG TTTGGCGATC TTTGGTATCT CAGTCAGGCC AACAATAGGA TGACAACAGA 720 

GGAAAGAACA ACTCCCACAG GCCAGCAGGC AGTTCCCAGT GTAGACCCTC ATTGGGACAC 780 

AGAATCAGAA CATGGAGATT GGTGCCACAA ACATTTGCTA ACTTGCGTGC TAGAAGGACT 840 

GAGGAAAACT AGGAAGAAGC CTATGAATTA CTCAATGATG TCCACTATAA CACAGGGAAA 900 

GGAAGAAAAT CTTACTGCTT TTCTGGACAG ACTAAGGGAG GCATTGAGGA AGCATACCTC 960

CCTGTCACCT GACTCTATTG AAGGCCAACT AATCTTAAAG GATAAGTTTA TCACTCAGTC 1020 

AGCTGCAGAC ATTAGAAAAA ACTTCAAAAG TCTGCCTTAG GCCCGGAGCA GAACTTAGAA 1080 

ACCCTATTTA ACTTGGCATC CTCAGTTTTT TATAATAGAG ATCAGGAGGA GCAGGCGAAA 1140 

CGGGACAAAC GGGATAAAAA AAAAAGGGGG GGTCCACTAC TTTAGTCATG GCCCTCAGGC 1200 

AAGCAGACTT TGGAGGCTCT GCAAAAGGGA AAAGCTGGGC AAATCAAATG CCTAATAGGG 1260

CTGGCTTCCA GTGCGGTCTA CAAGGACACT TTAAAAAAGA TTATCCAAGT AGAAATAAGC 1320 

CGCCCCCTTG TCCATGCCCC TTACGTCAAG GGAATCACTG GAAGGCCCAC TGCCCCAGGG 1380 

GATGAAGATA CTCTGAGTCA GAAGCCATTA ACCAGATGAT CCAGCAGCAG GACTGAGGGT 1440 

GCCCGGGGCG AGCGCCAGCC CATGCCATCA CCCTCACAGA GCCCCGGGTA TGTTTGACCA 1500 

TTGAGAGCCA A 1511

(2) INFORMATION FOR SEQ ID NO: 131:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:



Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu
1 5 10 15

Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr
20 25 30

Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr
35 40 45

Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp
50 55 60

Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser
65 70 75 80

Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser
85 90 95

Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn
100 105 110

Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys Gly
115 120 125

Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro Leu
130 135 140

Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val Pro
145 150 155 160

Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys Phe
165 170 175

Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln
180 185 190

Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr
195 200 205

Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe
210 215 220

Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu
225 230 235 240

Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro
245 250 255

His Trp Asp Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu
260 265 270

Leu Thr Cys Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met
275 280 285

Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu
290 295 300

Thr Ala Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser
305 310 315 320

Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe
325 330 335

Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu Pro
340 345 350



(2) INFORMATION FOR SEQ ID NO: 132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

TGCTGGAATT CGGGATCCTA GAACGTATTC 30 



(2) INFORMATION FOR SEQ ID NO: 133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

AGTTCTGCTC CGAAGCTTAG GCAGACTTTT 



(2) INFORMATION FOR SEQ ID NO: 135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 398 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

Arg Gly Ser His Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg
20 25 30

Ile Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr
35 40 45

Leu Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln
50 55 60

Tyr Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn
65 70 75 80

Tyr Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys
85 90 95

Trp Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn
100 105 110

Ser Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln
115 120 125

Ser Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr
130 135 140

Asn Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys
145 150 155 160

Gly Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro
165 170 175

Leu Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val
180 185 190

Pro Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys
195 200 205

Phe Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly
210 215 220

Gln Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln
225 230 235 240

Thr Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu
245 250 255

Phe Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr
260 265 270

Glu Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp
275 280 285

Pro His Trp Asp Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His
290 295 300

Leu Leu Thr Cys Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro
305 310 315 320

Met Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn
325 330 335

Leu Thr Ala Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr
340 345 350

Ser Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys
355 360 365

Phe Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu
370 375 380

Pro Lys Leu Ala Ala Ala Leu Glu His His His His His His
385 390 395



(2) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg Ile Leu Glu Arg
1 5 10 15

Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu Arg Lys Lys
20 25 30

Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr Pro Leu Gln
35 40 45

Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr Asn Ile Ile
50 55 60

Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp Ser Glu Val
65 70 75 80

Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser Gln Leu Cys
85 90 95

Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser Pro Pro Pro
100 105 110

Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn Lys Asp Pro
115 120 125

Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys Gly Val Asn Asn
130 135 140

Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro Leu Gln Ala Val
145 150 155 160

Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val Pro Phe Ser Leu
165 170 175

Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys Phe Ser Asp Asn
180 185 190

Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln Ser Phe Asp
195 200 205

Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr Leu Thr Pro
210 215 220

Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe Gly Asp Leu
225 230 235 240

Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu Glu Arg Thr
245 250 255

Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro His Trp Asp
260 265 270

Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu Leu Thr Cys
275 280 285

Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met Asn Tyr Ser
290 295 300

Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu Thr Ala Phe
305 310 315 320

Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser Leu Ser Pro
325 330 335

Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe Ile Thr Gln
340 345 350

Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu Pro Lys Leu Ala
355 360 365

Ala Ala Leu Glu His His His His His His
370 375



(2) INFORMATION FOR SEQ ID NO: 138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 
CTTGGAGGGT GCATAACCAG GGAAT 



(2) INFORMATION FOR SEQ ID NO: 139:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
TGTCCGCTGT GCTCCTGATC 



(2) INFORMATION FOR SEQ ID NO: 140:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:
CTATGTCCTT TTGGACTGTT TGGGT 



(2) INFORMATION FOR SEQ ID NO: 141:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 764 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60

CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120 

TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180 

CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240

ACACAAGGCT TGCCACCATG TTGGAAGCAG CCCACCACCA TTTTGGAAGC AGCCCGCCAC 300 

TATCTTGGGA GCTCTGGGAG CAAGGACCCC AGGTAACAAT TTGGTGACCA CGAAGGGACC 360 

TGAATCCGCA ACCATGAAGG GATCTCCAAA GCAATTGGAA ATGTTCCTCC CAAGGCAAAA 420 

ATGCCCCTAA GATGTATTCT GGAGAATTGG GACCAATTTG ACCCTCAGAC AGTAAGAAAA 480 

AAATGACTTA TATTCTTCTG CAGTACCGCC CTGGCCACGA TATCCTCTTC AAGGGGGAGA 540

AACCTGGCCT CCTGAGGGAA GTATAAATTA TAACACCATC TTACAGCTAG ACCTGTTTTG 600 

TAGAAAAGGA GGCAAATGGA GTGAAGTGCC ATATTTACAA ACTTTCTTTT CATTAAAAGA 660 

CAACTCGCAA TTATGTTAAC AGTGTGATTT GTGTTCCTAC ACGGAAGCCC TCAGATTCTA 720 

CTCCCCACCC CCGGCATCTC CCCTGAATCC CTCCCCAACT TATT 764 


(2) INFORMATION FOR SEQ ID NO: 142:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 800 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60 

CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120

TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180 

CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240 

ACACAAGGCT TGCCACCATG TTGGAAGCAG CCCACCACCA TTTTGGAAGC GGCCCGCCAC 300 

TATCTTGGGA GCTCTGGGAG CAAGGACCCC CAGGTAACAA TTTGGTGACC ACGAAGGGAC 360 

CTGAATCCGC AACCATGAAG GGATCTCCAA AGCAATTGGA AATGTTCCTC CCAAGGCAAA 420

AATGCCCCTA AGATGTATTC TGGAGAATTG GGACCAATCT GACCCTCAGA CAGTAAGAAA 480 

AAAAATGACT TATATTCTTC TGCAGTACCG CCTGGCCACG GATATCCTCT TCAAGGGGGA 540 

GAAACCTGGC CTCCTGAGGG AAGTATAAAT TATAACACCA TCTTACAGCT AGACCTGTTT 600 

TGTAGAAAAG GAGGCAAATG GAGTGAAGTG CCATATTTAC AAACTTTCTT TTCATTAAAA 660 

GACAACTCGC AATTATGTAA ACAGTGTGAT TTGTGTCCTA CAGGAAGCCC TCAGATCTAC 720

CTCCCTACCC CGGCATCTCC CTGACTCCTT CCCCAACTAA TAAGGACCCA CTTCAGCCCA 780 
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AACAGTCCaA aAuGACAaAG 

(2) INFORMATION FOR SEQ ID NO: 169:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

consensus (41/68-1 + 42/68-1 + cl43 68-1) 



(2) INFORMATION FOR SEQ ID NO: 170:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60 
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120 
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180 
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240 

ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAGATAG CCAGACCATT AAATACACGA 360 
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420 
GCTTTCCAGG CCCTAAAG 438 



(2) INFORMATION FOR SEQ ID NO: 171:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 438 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60

TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120 

GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180 

TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240 

ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300 

GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAGTAG CCAGACCATT AAATACACGA 360 

ATTAAGGAAA CTCAAAAAGC CAGTACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420

GCTTTCCAGG CCCTAAAG 438 



(2) INFORMATION FOR SEQ ID NO: 172: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 438 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
GACTTGAGCC AGTCYTCATA CCTGGACAYT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180 
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240 
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300 
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAATAG CCAGACCATT AAATACACGA 360 
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACATCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438 



(2) INFORMATION FOR SEQ ID NO: 173:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTPEAEV AFQALK 146

(2) INFORMATION FOR SEQ ID NO: 174:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 146 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKVARPLNTR IKETQKASTH LVRWTPEAEV AFQALK 146

(2) INFORMATION FOR SEQ ID NO: 175:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 146 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:
DLSQSSYLDX LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTSEAEV AFQALK 146
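
SEQ ID NOs: 173, 174 and 175 have the same length and differ at only a few residues. The sketch below lists those positions by direct comparison. It is an illustration only: `differences` is a hypothetical helper, positions are counted from 1 as in the listing, and X in SEQ ID NO: 175 is simply treated as a distinct symbol.

```python
# One-letter peptide sequences copied from the listing above; spaces are removed.
SEQ_173 = (
    "DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP"
    "KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT"
    "GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTPEAEV AFQALK"
).replace(" ", "")
SEQ_174 = (
    "DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP"
    "KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT"
    "GFCQIWIPRY SKVARPLNTR IKETQKASTH LVRWTPEAEV AFQALK"
).replace(" ", "")
SEQ_175 = (
    "DLSQSSYLDX LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP"
    "KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT"
    "GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTSEAEV AFQALK"
).replace(" ", "")


def differences(a, b):
    """Return (position, residue_in_a, residue_in_b) for each mismatch, 1-based."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


print(differences(SEQ_173, SEQ_174))
print(differences(SEQ_173, SEQ_175))
```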

(2) INFORMATION FOR SEQ ID NO: 176:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

consensus (1/46-7 + 8/46-7 + C15/46/7)

(2) INFORMATION FOR SEQ ID NO: 177:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 429 base pairs 
(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATGGGGA TGACTTAATT 60 
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGCGCTCTT AAATTTCCTT 120 
GCTACCTGTG GCTCCAAACA AAAGGCTCAC CTCTGCTCAC ACCAGGTTAA ATACTTAGGG 180 
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240 
TATCCTCATC CCATAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATATC AGCCTTCTGC 300 
CGAATATGGA TTCCCGGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AATTAAGGAA 360 
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420 
GCCCTAAAG 429 

(2) INFORMATION FOR SEQ ID NO: 178: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 429 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 

GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATAGGGA TGATTTAATT 60 
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGTGCTCTT AAATTTCCTC 120 
GCTACCTGTG GCTCCAAACA AAGGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180 
CTAAAATTAT CCAAAGTCGC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGAT 240 
TATCCTCATC CCAAAACCAT AAAGCAACTA AGAGGGTTCC TTGGCATAAC AGCCTTCTGC 300 
CGAATATGGA TTCCCCGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AGTTAAGGAA 360 
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AGACAGAAGT GGCTTTCCAG 420 
GCCCTAAAG 429 



(2) INFORMATION FOR SEQ ID NO: 179: 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 429 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 
GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCCTC AGTATGGGGA TGATTTAATT 60 
ATAGCCACCC ATTCAGAAAC CTTGTGGCAC CAAGCCACCC AAGCGCTCTT AAATTTCCTC 120 
GCTACCTGTG GCTCCAAACA AAAGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180 
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240 
TATCCCCATC CCAAAACCCT AAAGCAACTA AGARGGTTCC TTGGCATAAC AGCCTTCTGC 300 
CGAATATGGA TTCCCAGATA CAGCGAAATA GCCAGGCCAT TATGTACATT ATCTAAGGAA 360 
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420 
GCCCTAAAG 429 



(2) INFORMATION FOR SEQ ID NO: 180: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 
DLSQSSYLDI LVLQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAH 50 
LCSHQVKYLG LKLSKVTRAL REERIQRILA YPHPITLKQL RGFLGISAFC 100 
RIWIPGYSEI ARPLCTLIKE TQKANTHIVR WTPETEVAFQ ALK 143 
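
The nucleotide and peptide entries of this listing can be cross-checked against each other. The Python sketch below is illustrative only and is not part of the original listing. It assumes that SEQ ID NO: 180 is the translation of SEQ ID NO: 177 in its first reading frame under the standard genetic code, a relationship the listing itself does not state. The names CODON_TABLE and translate are hypothetical helpers introduced here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt: str) -> str:
    """Translate frame 1 of a nucleotide string; whitespace is ignored.

    Codons containing anything other than A, C, G, T are reported as 'X'.
    A trailing partial codon is dropped.
    """
    seq = "".join(nt.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# SEQ ID NO: 177 as printed in the listing (position counts omitted).
SEQ_ID_177 = """
GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATGGGGA TGACTTAATT
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGCGCTCTT AAATTTCCTT
GCTACCTGTG GCTCCAAACA AAAGGCTCAC CTCTGCTCAC ACCAGGTTAA ATACTTAGGG
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT
TATCCTCATC CCATAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATATC AGCCTTCTGC
CGAATATGGA TTCCCGGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AATTAAGGAA
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG
GCCCTAAAG
"""

# SEQ ID NO: 180 as printed in the listing (position counts omitted).
SEQ_ID_180 = """
DLSQSSYLDI LVLQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAH
LCSHQVKYLG LKLSKVTRAL REERIQRILA YPHPITLKQL RGFLGISAFC
RIWIPGYSEI ARPLCTLIKE TQKANTHIVR WTPETEVAFQ ALK
"""

if __name__ == "__main__":
    peptide = translate(SEQ_ID_177)
    print(len(peptide), peptide == "".join(SEQ_ID_180.split()))
```

Under these assumptions the 429 nucleotides yield 143 residues, which agrees with the length stated for SEQ ID NO: 180. The same helper could be applied to the other nucleotide/peptide pairs of the listing, although which entries pair with which is likewise an assumption.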



(2) INFORMATION FOR SEQ ID NO: 181: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 
DLSQSSYLDI LVLQYRDDLI IATHSETLWH QATQVLLNFL ATCGSKQRAQ 50 
LCSQQVKYLG LKLSKVARAL REERIQRILD YPHPKTIKQL RGFLGITAFC 100 
RIWIPRYSEI ARPLCTLVKE TQKANTHIVR WTPETEVAFQ ALK 143 

(2) INFORMATION FOR SEQ ID NO: 182: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 
DLSQSSYLDI LVPQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAQ 50 
LCSQQVKYLG LKLSKVTRAL REERIQRILA YPHPKTLKQL RXFLGITAFC 100 
RIWIPRYSEI ARPLCTLSKE TQKANTHIVR WTPETEVAFQ ALK 143 

(2) INFORMATION FOR SEQ ID NO: 183: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs 
(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 
GGCCAGGCAT CAGCCCAAGA CTTGA 

(2) INFORMATION FOR SEQ ID NO: 184: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 
TGCAAGCTCA TCCCTSRGAC CT 

(2) INFORMATION FOR SEQ ID NO: 185: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 
GACTTGAGCC AGTCCTCATA CCT 

(2) INFORMATION FOR SEQ ID NO: 186: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 
CTTTAGGGCC TGGAAAGCCA CT 
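
The short entries SEQ ID NO: 183 to SEQ ID NO: 186 read like oligonucleotides. The sketch below is illustrative only. It rests on two assumptions not stated in this part of the listing: that SEQ ID NO: 185 and SEQ ID NO: 186 act as a forward and a reverse primer on a template such as SEQ ID NO: 177, and that the letters S and R in SEQ ID NO: 184 are IUPAC degenerate codes (S = C or G, R = A or G). All function names are hypothetical helpers.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of every IUPAC code (for example R = A/G pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def clean(seq: str) -> str:
    """Remove the spacing used in the listing and upper-case the sequence."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, read 5' to 3'."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def expand_degenerate(seq: str) -> list[str]:
    """List every concrete sequence covered by a degenerate oligonucleotide."""
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in clean(seq)))]


SEQ_ID_184 = "TGCAAGCTCA TCCCTSRGAC CT"
SEQ_ID_185 = "GACTTGAGCC AGTCCTCATA CCT"
SEQ_ID_186 = "CTTTAGGGCC TGGAAAGCCA CT"

# First 30 and last 39 nucleotides of SEQ ID NO: 177, copied from the listing.
TEMPLATE_5P = "GACTTGAGCC AGTCCTCATA CCTGGACATT"
TEMPLATE_3P = "TGGACACCTG AAACAGAAGT GGCTTTCCAG GCCCTAAAG"

if __name__ == "__main__":
    print(clean(TEMPLATE_5P).startswith(clean(SEQ_ID_185)))
    print(clean(TEMPLATE_3P).endswith(reverse_complement(SEQ_ID_186)))
    for variant in expand_degenerate(SEQ_ID_184):
        print(variant)
```

Read this way, SEQ ID NO: 185 matches the first 23 nucleotides of SEQ ID NO: 177, the reverse complement of SEQ ID NO: 186 matches its last 22 nucleotides, and SEQ ID NO: 184 covers four concrete sequences.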



TABLE No. 5 

TABLE No. 6 

CLAIMS 

1. Nucleic material, in the isolated or 
purified state, comprising a nucleotide sequence selected 
from the group including sequences SEQ ID NO: 93, SEQ ID 
NO: 94, their complementary sequences and their equivalent 
sequences, in particular nucleotide sequences displaying, 
for any succession of 100 contiguous monomers, at least 
50% and preferably at least 60% homology with said 
sequences SEQ ID NO: 93, SEQ ID NO: 94 and their 
complementary sequences, excluding the HSERV-9 sequence. 

2. Nucleic material of claim 1, the nucleotide 
sequence of which is selected from the group including 
sequences SEQ ID NO: 93, SEQ ID NO: 94, their complementary 
sequences and their equivalent sequences, in particular 
nucleotide sequences displaying, for any succession of 100 
contiguous monomers, at least 70% and preferably at least 
80% homology with said sequences SEQ ID NO: 93, SEQ ID NO: 94 
and their complementary sequences. 

3. Nucleic material, in the isolated or 
purified state, coding for any polypeptide displaying, for 
any contiguous succession of at least 30 amino acids, at 
least 50%, preferably at least 60%, and most preferably 
at least 70% homology with a peptide sequence encoded by 
any nucleotide sequence selected from the group including 
SEQ ID NO: 93, SEQ ID NO: 94 and their complementary 
sequences. 

4. Nucleic material, in the isolated or 
purified state, of retroviral type, comprising a 
nucleotide sequence identical or equivalent to at least 
part of the pol gene of an isolated retrovirus associated 
with multiple sclerosis or rheumatoid arthritis. 

5. Nucleic material as claimed in claim 4, 
said nucleotide sequence being 80% homologous to said at 
least part of the pol gene. 

6. Nucleic material comprising a nucleotide 
sequence identical or equivalent to at least part of the 
pol gene of an isolated virus encoding a reverse 
transcriptase comprising an enzymatic site comprised 
between the amino acid domains LPQG and YXDD, said virus 
having a phylogenic distance with HSERV-9 of 0.063 ± 0.1, 
and preferably 0.063 ± 0.05. 

7. Nucleotide fragment comprising a nucleotide 
sequence selected from the group including SEQ ID NO: 93, 
SEQ ID NO: 94, their complementary sequences and their 
equivalent sequences, in particular nucleotide sequences 
displaying, for any succession of 100 contiguous monomers, 
at least 50% and preferably at least 60% homology with 
said sequences and their complementary sequences, said 
group excluding SEQ ID NO: 1, and said nucleotide fragment 
neither comprising nor consisting of the sequence HSERV-9. 

8. Nucleotide fragment of claim 7, the nucleotide 
sequence of which is selected from the group including SEQ 
ID NO: 93, SEQ ID NO: 94, their complementary sequences and 
their equivalent sequences, in particular nucleotide 
sequences displaying, for any succession of 100 contiguous 
monomers, at least 70% and preferably at least 80% 
homology with said sequences and their complementary 
sequences. 

9. Nucleotide fragment comprising a coding 
nucleotide sequence which is at least partially identical 
to a nucleotide sequence selected from the group 
including: 

SEQ ID NO: 93, SEQ ID NO: 94; their complementary 
sequences; their equivalent sequences, in particular 
homologous to SEQ ID NO: 93, SEQ ID NO: 94; 

sequences encoding at least part of the peptide 
sequence defined by SEQ ID NO: 95; 

sequences encoding at least part of a peptide 
sequence equivalent, in particular homologous, to SEQ ID 
NO: 95, which is capable of being recognized by sera of 
patients infected with the MSRV-1 virus, or in whom the 
MSRV-1 virus has been reactivated. 

10. Nucleic acid probe for the detection of a 
virus associated with multiple sclerosis or rheumatoid 
arthritis, characterized in that it is capable of 
hybridizing specifically with any fragment according to 
any one of claims 7 to 9. 

11. Probe as claimed in claim 10, consisting of 
between 10 and 1,000 monomers. 

12. Primer for the amplification by 
polymerization of an RNA or a DNA of a viral material 
associated with multiple sclerosis or rheumatoid 
arthritis, comprising a nucleotide sequence identical or 
equivalent to at least one portion of the nucleotide 
sequence of a fragment as claimed in any one of claims 7 
to 9, in particular a nucleotide sequence displaying, for 
any succession of at least 10 contiguous monomers, 
preferably 15 contiguous monomers, more preferably 18 
contiguous monomers and most preferably 20 contiguous 
monomers, at least 70% homology with at least the said 
portion of the said fragment. 

13. Primer as claimed in claim 12, comprising a 
sequence selected from the group consisting of SEQ ID NO: 
99 to SEQ ID NO: 111. 

14. Polypeptide encoded by any open reading 
frame belonging to a nucleotide fragment as claimed in any 
one of claims 7 to 9. 

15. Polypeptide of claim 14, characterized in 
that the open reading frame encoding it is comprised, in 
the 5'-3' direction, between nucleotide 18 and nucleotide 
2304 of SEQ ID NO: 93. 

16. Polypeptide according to claim 15, 
comprising a peptide sequence at least partially identical 
to SEQ ID NO: 95. 

17. Polypeptide, comprising a peptide sequence 
at least partially identical to SEQ ID NO: 96. 




18. Polypeptide of claim 17 exhibiting an 
enzymatic activity consisting of proteolytic activity. 

19. Polypeptide, characterized in that the open 
reading frame encoding it begins, in the 5'-3' direction, 
at nucleotide 18 and ends at nucleotide 340 of SEQ ID 
NO: 93. 

20. Polypeptide exhibiting an inhibitory 
activity on the proteolytic activity of the polypeptide of 
claim 18. 

21. Polypeptide, comprising a peptide sequence 
identical or equivalent to SEQ ID NO: 97. 

22. Polypeptide of claim 21, comprising a 
peptide sequence identical or equivalent to SEQ ID NO: 98. 

23. Polypeptide, characterized in that the open 
reading frame encoding it begins, in the 5'-3' direction, 
at nucleotide 341 and ends at nucleotide 2304 of SEQ ID 
NO: 93. 

24. Polypeptide, characterized in that the open 
reading frame encoding it begins, in the 5'-3' direction, 
at nucleotide 1858 and ends at nucleotide 2304 of SEQ ID 
NO: 93. 

25. Polypeptide of claim 21 or 23, exhibiting a 
reverse transcriptase activity. 

26. Polypeptide of claim 22 or 24, exhibiting a 
ribonuclease H activity. 

27. Polypeptide exhibiting an inhibitory 
activity on the reverse transcriptase activity of the 
polypeptide of claim 25. 

28. Polypeptide having an inhibitory activity 
on the ribonuclease H activity of the polypeptide of claim 26. 

29. Antigenic polypeptide recognized by the 
sera of patients infected with the MSRV-1 virus, and/or in 
whom the MSRV-1 virus has been reactivated, characterized 
in that its peptide sequence is at least partially 
identical or is equivalent to a sequence selected from the 
group consisting of SEQ ID NO: 95, and fragments thereof, 
in particular SEQ ID NO: 96, SEQ ID NO: 97 and SEQ ID NO: 
98. 

30. Mono- or polyclonal antibody directed 
against the MSRV-1 virus, characterized in that it is 
obtained by the immunological reaction of a human or 
animal body or cells to an immunogenic agent consisting of 
an antigenic polypeptide of claim 29. 

31. Reagent for detection of the MSRV-1 
virus, or of an exposure to the said virus, characterized 
in that it comprises at least one reactive substance 
selected from the group consisting of a probe as claimed 
in claim 10 or 11; a polypeptide as claimed in any one of 
claims 14 to 29; or an antibody as claimed in claim 30. 

32. Diagnostic, prophylactic or therapeutic 
composition, in particular for inhibiting the expression 
of a virus associated with multiple sclerosis or 
rheumatoid arthritis, and/or the enzymatic activity of the 
proteins of said virus, said composition comprising a 
nucleotide fragment of any one of claims 7 to 9. 

33. Diagnostic, prophylactic or therapeutic 
composition comprising a polypeptide of any one of claims 
14 to 29, or an antibody of claim 30. 

34. Process for detecting a virus associated 
with multiple sclerosis or rheumatoid arthritis, in a 
biological sample, characterized in that an RNA and/or a 
DNA presumed to belong to or to originate from said virus, or 
their complementary RNA and/or DNA, is/are brought into 
contact with a nucleotide fragment according to any one of 
claims 7 to 9. 

35. Process for detecting the presence of or 
exposure to a virus associated with multiple sclerosis or 
rheumatoid arthritis, in a biological sample, wherein said 
sample is brought into contact with a polypeptide 
according to any one of claims 14 to 29, or an antibody of 
claim 30. 
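
Several of the claims above set a homology threshold "for any succession of 100 contiguous monomers". The Python sketch below shows one possible way to evaluate such a criterion and is not taken from the specification. It assumes that "homology" is read as percent identity and that the two sequences have already been aligned without gaps to equal length. The function names are hypothetical.

```python
def min_window_identity(a: str, b: str, window: int = 100) -> float:
    """Lowest percent identity found in any run of `window` aligned positions.

    Assumes `a` and `b` are pre-aligned, ungapped and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if len(a) < window:
        raise ValueError("sequences are shorter than the window")
    matches = [x == y for x, y in zip(a.upper(), b.upper())]
    current = sum(matches[:window])  # matches in the first window
    lowest = current
    for i in range(window, len(matches)):
        # Slide the window one position: add the new column, drop the oldest.
        current += matches[i] - matches[i - window]
        lowest = min(lowest, current)
    return 100.0 * lowest / window


def meets_threshold(a: str, b: str, threshold: float = 50.0, window: int = 100) -> bool:
    """True if every window of `window` positions reaches `threshold` percent identity."""
    return min_window_identity(a, b, window) >= threshold


if __name__ == "__main__":
    # Toy example with a 5-position window instead of 100.
    print(min_window_identity("ACGTACGTAC", "ACGTTCGTAA", window=5))
```

Because the wording "any succession" is taken here to mean every window, the check uses the minimum over all windows rather than an average over the whole sequence. A different reading of the claim language would call for a different test.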
TCAGGGATAGCCCCCATCTATTTGGCCAGGCATTAGCGC^GA^GACTC 

AATTCTCATACCTGGACACTCTTGTCCTTCAGTACATGGATGATTTACTTT 

^CTCGCCCGTTCAGAAACCTTGTGGCATCAAGCCACCCAAGAACTCTTAA 

CTTrCCTCACTACCTGTGGCTACAAGGTTTCCAAACCAAAGGCTCGGCTCT 

GCTCACAGGAGATTAGATACTNAGGGCTAAAATTATCCAAAGGCACCAGG 

Sct^gtgaggaacctatccagoctatactggcttatcctcatcccaaa 

SCTAAAGCAACTAAGAGGC^CCTrGGCATAACAGGTTTCTGOCGAAA . 

ACAGATTCCCAGGTACASCOCAATAGCCAGACCATTATATACACTAATTA 

NGGAAACTCAGAAAGCCAATACCTATTTAGT/VAGATGGACACCTACAGAA 

g^gWccaggccctaaagaaggc^^ 

^G^CAACAGGGCAAGATTTTTCTrrATATGCCACAGAAAj^^^A^GAj^ 
AGCTCTAGGACTCCTTACGCAGGTCTCAGGGATGAGCn-GCAACCCGTGGT 

ATACCTGAGTAAGGAAATTGATGTAGTGGCAAAGGGTT 

SEQ ID NO 46 (FBd3) 

^rrrrATTGGTGTATITACAATCCnTATCTAATCCGAAATGCCCATGTTG 
GTGCTG ATTGGTO ,1- A ^1 1 1 CCTAACCTCT GG GGGAACCCCCATTAAA 

ggtggaag^S 

rA^AGCATAAGTGGCTACAGAGGCAAGGAAAGACT^ 

===== 

aaa ArrTATAATTGATAACTGAAGGTCTrCrCTGTAACCCTGTAACACl^ 
^^OTOTGTCAAGTGTAAACAAGGGCGTAGCCCAAAA 

^CCTA^^^^ 

eAGACTTAGGAGGAATTCCCTTCAGGACGGGAAGATAGATGCTTCCTCCCA 



A 



GGCCtAGACCTCCT^^CT 
CTA^A^CA^CCTCTG^AfflTGGGCAACATGGTrTCTTCCCTITCTATGTCCC 



Ti^r-^^iTrxrnCrATrACTCGCCTTTGGGCCCTGTAl 1 1 1 1 AACCTCC 

S^H^A^^^^GCCCAATrcCCA^G^SS^CTGGGGTGTCCCGTTTAGACT 
GTGTCTAGCTAAAGGATi U 1 AAA i A^n^nn . 

SEQ ID NO 51 (tpol) 



rfirTGCTAAAGGAGACTTGTGGTTGTCAGACAATCGCCTACTTAGGTACCA 

taacccaaagttatgctgcccagaaggatcttttagaggtccccttagcca 

ACCCTGACCrcAACCTATATATATACTGATGGAAGTTCGTTTGTAGAAAAG 

ggaSacaaIg^naggatatnc^^ 

aaaSaagotcttccccccagggaccagcgcccccgttagcagaactact 

ggcactg^a^cS^ga^^otagaacttggaaagggaggaggataaatgtgt 

^cSatagcaagtatgcttatctaatccgaaa^ 

SEQ ID NO 52 (JLBc1) 

Trrnrrrr a nnr ACTGGCCC A AQ ATCTAGGG A 



^ACTCTAGATCTCrrGAACITrCTAGCT 

GCTTATCCTCACCCTAAGACATTAAAACAGTTGCGGGGGTTCC 



ACTCGCT^TTTGGTCACTATGGATTCCCAGATACAGCAAGATTXjGCAG^ 



^Icaa^ccSaag^ 
™™^a^c^tacta^ 

S^^^^^^rxl^GATACCAG^GACTACTCCTGGAGGATIXKKKnTCAAG 
^ A ^^5SS^ww-r^ArC^^GCCACriTTCCTCCAGAGGATGGAGAG 

===== 

SS^^^^^^A^A^ 
^^^A^^A^^ACCCTCCTCCAGGGGATTG^GA^ 

SSS^. A ^=aSa^^C 

AAGGGAACACTA* 



GAOOTTTTATTAATAOCOA^GGAGA^^AAATC^ 

AATCTATCTGGTAAGGATAGTAACTACA ATCCC a/v^ .m^.^...^^ 
CAGTGACTCTC 

SEQ ID NO 53 (JLBc2) 

ccccatctatttgatcaggcactagcccaagatctaggcc 



A^^AAOTC'CAGGCATTCTAGTCCTTCAGTAT^ 
ACTTCTGAAu l ^« w ^ _ \GGCTACTTGAGA1 

taaattgaaagtc 
Sctacaacaagtcaaatatctaggcctaatcttagatagaagaaccaggg 



VGCCTCATGCCAGCAGGCTACTTGAGATCTCTTGAA 

ctt^agctaat^agggtgtatggcatctaaattc 



GGCTACCAGTTTGGAAC 



a ArrCTTCoSACCTTAAAGGAGACCCTAGTACAAGCrCCAGCTTTAA 

rGTTT 



TAGCTCCTGGAGTCCnTACTCAGACITTTGG^ 

RTACCTAAGTAAGGAAATTGATGTAGTAGCAA^AGGCTGGCCTCACTC 



A^GGTAGTrGCGGCTGTGGCAGTCTrACTGTCAAAGGCTATCAAAATAAT 
I^S ^ a Y^^ACTATCTGGACTACTCATGAGGAAAATGGCATATT 

^r^ACTGA^GA^ACCAGTGCnTTAAATA^ 

^^aSctgccactcttctcccagaagatgga 

A rTGTC A AC^A ATTAGAGTCC AG A GTT ATG CTGCCTG AG AGG ATCTCTTAG 
A A^TCCCCTTA^CTA ATCCTG ACCTT A ACCT ATATGCTG ATGG AAGTTCAC 

^Igctggcagt^cactgcctgaagctatggggaaggagagagg^ 

^P^P^rQ^^YAAGT^GCTAGCAGAGGCAGCGAAAGACTAGCAGAGAGGA 
r APG^AG^GG^^AG/X^^AAAGTCAAAGAAAAGAAGTCAAAGACAGACA 

g^g^Sgac^a^ 

^ArAY^ATG^^CAAGAACAGAAGAGAGAGGCAGCGCC^GAAGAGTrAAG 
aaa^GA^A^A^TC 



a^t^^oa a Af^TTTTrAACAAAAGTAAAGTTTGCTGAAAGTTAG 
GGCAGCAGTGACTCTC 

1 TTCCTGAGTT CTTGCACTAA CCTCAAATGA GAGAAGTGCC GCCATAACTG CAACCCAAGA 
61 GTTTGGCGAT CCCTGGTATC TCAGTCAGGT CAATGACAGG ATGACAACAG AGGAAAGATA 
121 ATGATTCCCC ACAGGCCAGC AGGCAGTTCC CAGTGTAGAC CCTCATTAGG ACACAGAATC 
181 AGAACATGGA GATTGGTGCC GCAGACATTT GCTAACTTGC GTGCTAGAAG GACTAAGGAA 
241 AACTAGGAAG ATATGAATTA TTCAATGATG TCGACTATAA CACAGGGGAA AGGAAGAAAA 
301 TCCTACTCCC TTTCTGGAGA GACTAAGGGA GGCATTGAGG AAGCATACCA GGCAAGTGGA 
361 CATTGGAGGC TCTGGAAAAG GGAAAAGTTG GGAAAAGTAT ATGTCTAATA GGGCTTGCTT 
421 CCAGTGTGGT CTACAAGGAC ACTTTAAAAA AGATTGTCCA ATAGAAATAA GCCACCACCT 
481 CGTCCATGCC CCTTATGTCA AGGGAATCAC TGGAAGGCCC ACTGCCCCAG GGGATGAAGG 
541 TCCTCTGAGT CAGAAGCCAC TAACCAGATG ATCCAGCAGC AGGACTGAGG GTGCCCGGGG 
601 CAAGCGCCAG CCCATGCCAT CACCCTCACA GAGCCCCAGG TATGCTTGAC CATTGAGGGT 
661 CAGAAGGGTA CTGTCTCCTG GACACTGGCG GGCCTTCTCA GTCTTACTTT CCTGTCCTGG 
721 ACAACTGTCC TCCAGATCTG TCACTGTCCG AGGGGTCCTA GGACAGCCAG TCACTAGATA 
781 CTTCTCCCAG CCACTAAGTT GTGACTGGGG AACTTTACTC TTCCACATGC TTTTCTAATT 
841 ATGCCTGAAA GCCCCACTCT CTTGTTAGGG GAGAGACATT CTAGCAAAAG CAGGGGCCAT 
901 TATACATGTG AATATAGGAG AAGGAACAAC TGTTTGTTGT CCCCTGCTTG AGGAAGGAAT 
961 TAATCCTGAA GTCCGGGCAA CAGAAGGACA ATATGGACAA GCAAAGAATG CCCGTCCTGT 
1021 TCAAGTTAAA CTAAAGGATT CCACCTCCTT TCCCTACCAA AGGCAGTACC CCCTCAGACC 
1081 CGAGACCCAA CAAGAACTCC AAAAGATTGT AAAGGACCTA AAAGCCCAAG GCCTAGTAAA 
1141 ACCAAGCAAT AGCCCTTGCA AGACTCCAAT TTTAGGAGTA AGGAAACCCA ACGGAC 

GATGCCTTTTTCTGCATCCCTGTACGTCCTGACTCTCAATTCTTGTTTGCCTTTGAAG 
ATCCTTTGAACCCAACGTCTCAACTCACCTGGACTGTTTTACCCCAAGGGTTCAGGGA 
TAGCCCCATCTATTTGGCCAGGCATTAGCCCAAGATGCCTTTTGCATCCCTGTACGTG 
ACTCTCAATTCTTGTTTGCCTTTGCCTTTGAAGATGCTTTGAACCCAACGTCTCAACT 
CACCTGGACTGTTTTACGCCAAGGGTTCAGGGATAGCCCCCATCTATTTGGC 

SEQ ID NO 40 

CAGGCATTAGCCCAA 



Asp-Ala-Phe-Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp-Ser-Gln-Phe- 
Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn-Pro-Thr-Ser-Gln-Leu- 
Thr-Trp-Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg-Asp-Ser-Pro-His- 

Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln 

FIG. 34 



Cys-Ile-Pro-Val-Arg-Pro-Asp-Ser-Gln-Phe-Leu SEQ ID NO 41 



Val-Leu-Pro-Gln-Gly-Phe-Arg-Asp-Ser-Pro-His-Leu-Phe-Gly- 
Gln-Ala-Leu-Ala SEQ ID NO 42 



Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu SEQ ID NO 

Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn SEQ ID NO 44 

CTTOGCCAAC OOOUL'ITUJA AOOCAAACAG TOCAftAAQGA 50 

L P Q L I R T PLS TQTV QKD 
FPN ..GP PF Q PKQ SKRT 
SPT NKD PPFN PNS PKG 

CAT&GACAAA GGSCTAAACA ATCAAOCAAA GAGTGCCAAT ATKJOCIGGT 100 
IDK GVNN EPK SAN IPWL 
. T K E.T MNQR VPI FPG 
HRQR SKQ . T K ECQY SLV 

TAUGCftOOCT OCAfiGOQGTG GGAGAAGAAT TCX3Q00CAGC CAGAGTGCAT 150 

CTL Q A V GEEF G P A RVH 
YAPS KRW E K N S A Q P E C M 
MHP PSGG RRI RPS QSAC 

gtagcttttt cicicic ft CA. ctigaagcaa attaaaaiag tosieaqctna. 200 

VPFS LSH LKQ IKID XGX 
YLF LSHT . S K LK. T.VN 
TFF SLT LEAN .NR XRX 

ATINICftGAT AGXCIGATG GYTATATIGA TCITTEACftA G3ATEAG3AC 250 
XSD SPDG YID VLQ GLGQ 

XQI ALM XILM FYK D.D 
IXR. P . W L Y . CFTR I R T 

AftlCCTTIGA TCIGZOIGG AGAGAEATAA TRTEfiCTGCT AAATCAGAOG 300 

S F P L T W R D I I L L L N Q T 
N PL I .HG E I. YYC. IRR 
IL. SDME RYN ITA KSDA 

CTAftOCTCaA ATGAGAGAAG TGCT3XMA ACrGGAGOCC GAGAGTTTGG 350 
LTSN ERS A A I TGAR EFG 
.PQ MREV LP. LEP ESLA 
NLK . E K CCHN W SP RVW 

CAATCICIOG TA3CICAGTC AGGTCAAIGA TAGGATGACA AOGIGAQGAftA 400 
NLW YLSQ VND RMT TEER 

ISG ISV RSMI G.Q RR-K 
QSLV SQS GQ. . D D N GG<"K 

GAGAACGATT G00CACAG3G CAGCAQQCAG TIOCCAGIGT AOJlUJiVJflT 450 

ERF PTG QQAV PSV APH 
ENDS PQG SRQ FPV. LLI 
RTI P H K A AV»o o u v. ^ ^ - 

TG33AC30G AAJCAGAACA TQGAGATTGG TQOOQCAGAC ATTTA 495 
WDTE SEH GDW CRRH L 
GTQ NQNM EIG AAD I 
GHR IRT WRL.V PQT F 
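
The figure sheets in this part of the document appear to follow one layout: a line of 50 nucleotides with a running position count, followed by three lines of one-letter amino acids, one per forward reading frame, in which a period marks a stop codon. The Python sketch below reproduces that kind of three-frame translation. It is an illustration of the layout as inferred from the scan and assumes the standard genetic code. The function name is hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; "." marks a stop codon, as in the figures.
AMINO = "FFLLSSSSYY..CC.WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def three_frames(nt: str) -> list[str]:
    """Translate the three forward reading frames of a nucleotide string.

    Whitespace is ignored; codons with non-ACGT symbols are reported as 'X'.
    """
    seq = "".join(nt.split()).upper()
    frames = []
    for offset in range(3):
        # Step through complete codons only, starting at the frame offset.
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames.append("".join(CODON_TABLE.get(codon, "X") for codon in codons))
    return frames


if __name__ == "__main__":
    # First 15 nucleotides of SEQ ID NO: 177, used only as a short example.
    for number, frame in enumerate(three_frames("GACTTGAGCC AGTCC"), start=1):
        print(f"frame {number}: {frame}")
```

Reading the amino-acid lines against a translation produced this way is one means of locating OCR errors in the nucleotide lines, since the peptide rows in the scan are generally better preserved than the nucleotide rows.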
LPQL IRT PLS TQTV QKD 

CATAGACAAA GG?*3IAAftCA AIG&AOCAAA ATICXXTOOT 100 

IDK GVNN EPK SAN IPWL 

TA3GCAOQCT CCA&GQQOT3 GG&GAftGftRT TO3300C&3C CAGAGK3CAT 150 
CTL Q A V GEEF G P A RVH 

GEffiDLTlTiT CTCTC1CACA CTIG&MCAA MTAAAATAG AC3CEftQ3ISA 200 
VPFS LSH LKQ IKID LGK 

ATICIOGAT KHCTMG GYTAE7OTGA. TOITTIftCAA GGAITOOGAC 250 
FSD SPDG YID VLQ GLGQ 

A&TOCTTIGA TCTCftCATQS A3AGAIAIEAA TMTACIGCT AAATCAGMG 300 
SFD LTW RDII LLL N Q T 

CTAAOCTCAA ATCAGAGA2G TOCIGOCMA ACIGGftGOQC GMftGTiTOG 350 
LTSN ERS A A I TGAR EFG 

CAAIUICT33 A33ICAKTGA TAQ3MGACA AG3GAGGAAA 400 

NLW YLSQ VND RMT TEER 

G&3AAQGATT OOC3C?^A3QS CftGCAGGJCAG TTIOOCftGIGrr A3CT0CICAT 450 
ERF PTG QQAV PSV APH 

TOGSGATTGG TOCOGC2£&C ATITACAACT 500 
WDTE SEH GDW CRRH LQL 

TOCCTOCTW AftQGftCINM GAAAACIMG AMftCEMCA ATTKTICAAN 550 
ACX KDXG KLG RLX IIQX 

G^TOTOCACT ANISPOO*3S GGftMGG&B3 AAAATOCTAC TOCCTITCTG 600 
CPL XHR GKEE NPT AFL 

GMW3ACEAA Q3GAG3ZATT GftGG&ftGCM 1 AXMGCAfiG TQSACKPIGG 650 
ERLR E A L RKH TRQV DIG 

*03CICIQGA AAAG3SAAAA GTIGGQGAAA. T3ATATO0CT AAIAS3QCTT 700 
GSG KGKS WAN YMP NR AC 

GCTTCOGIG OOCTftCAA GGftCQCTra GAAAftGMTG TOCAAOTA3A 750 
FQC S L Q G RFR KDC PSR 

AMAMOCGC OXlUGKCCft TOUOajriffl* GICafiGgSftA TCACTOGftftG 800 
N K P P XiVH APY V K G I TGR 

QQC33£2QCC OCM33GAD3 AMGflOCTCT GftGTCCBGftAG OCACIAAOCT 850 
PTA PGDE GPL SQK PLT. 

FIG. 38 

AMGAAACIC AGAAAGQCAA TACCCATITA GTAftGATOGk 50 
KETQ KAN THL V R Vv T PEA 
R K L R K P I PI. .DGHQKQ 
GNS ESQ YPFS KMD TRS 

^GAAGCAGCT TTXfiOXEC TAAftGAAftTC GCDOCftGICT 100 

E A A FQAL KKS LTQ APVL 
KQL S RP . R N P .PK PQC 
RSSF PGP KEI PNPS PSV 

TAMCTIQCC Aa03Q33CAA GftCTTTICIT TATMGIGAC AGAAAAftCEG 150 
SLP TGQ DFSL YVT EKQ 
ACQ RGK TFL YMSQ KNR 
K LA NGAR LFF ICH R K TG 

GftftlftGTICT AGSfiGTOCTT ACftCftGGICC AAGGGACAAG CTIQCAftCCT 200 
E.L. ESL HRS KGQA CNL 
NSS RSPY TGP RDK LATC 
IAL GVL TQVQ GTS LQP 

GIQQCATACC TGAGTA&QGA AZOGAT3TA NIQQCAAAGG GTIQQCCICA 250 
WHT .VRKLMXWQRVGLI 
GIP E.G N.CX GKG LAS 
VAYL SKE TDV XAKG WPH 

TTGTITACfiG GTAQQGCftGC AGTCAQCfiGIC TTftGITTCTS AAACAGITAA 300 

VYR .GS SSSL SF. N S . 
LFTG R A A VAV LVSE T VK 
CLQ VGQQ .QS . F L KQLK 

AATAftTACfiG QGAAGAGATC TEOGIGTO GACMCICftT GAIUTCAAOS 350 
NNTG KRS Y C V D I S . C E R 
IIQ GRDL TVW TSH DVNG 
.YR EEI LLCG H LM M.T 

QCATACTCAC TGCTAAAGAG GACTIGTGQC TGICAGfiCAA CCATTEfiCTT 400 
HTH C.RG LVA VRQ PFT, 
ILT AKE DLWL SDN HLL 
AYSL LKR TCG CQTT IYL. 

AAATAGCfiGG TICTKTERCT TGMGIGOCA GIQCIQCGAC TGCfiCAITIG 450 

IAG SIT .SAS A A T AHL 
K QV LLL EVP VLRL HIC 
NSR FYY L KCQ CCD CTFV 

TGCAftCTCTT AACCCA3CCA CMTTCTTOC AGACAATGAA GAAAAGATAG 500 
CNS. PSH ISS RQ-R KDR 
ATL NPAT FLP DNE EKIE 
QLL TQP HFFQ TMK KR. 

p |VJOO AftCAIftftCIG TCaACAAGm MTuCBJAAA. uuimuuiuC lU^flijQQdHC 550 

bT.L ST SN CSN LCC SRGP 
HNC QQV IAQT Y A A RGD 
NITV NK. LLK PMC'L'EGT 

CITCIRGfiGG TICJOCTKaftC TGAIGOOSAC CTCAftCTIGT ATACIGATQG 600 

SRG SLD .SRP QLV Y.W 
LLEV PLT DPD LNLY TDG 
F.R FP.L IPT STC ILME 

Aftb TlU L' l ' lG QCftGA&AAflG GflCTTTGftAA AGQQQQGTAT GCAGTGATCA 650 
KFLG RKR TLK SG VC SDQ 
SSL A E K L.K AGY AVIS 

V P W Q K K D F E K R G M Q . S 

GTGATAAIGG AAIACITGAA AGIAAT030C TCACICCAQ3 AACIftGIGCT 700 
. .W NT.K .SP HSR N.CS 
D N G ILE SNRL TPG TSA 

VIME YLK V IA SLQE L VL 

CADCIG3CRG AACIAMW3C (XTCftCTIGG GCACTftGAAT TftGGftGAfiGG 750 

PGR TNS PHLG TRI RRR 
HLAE L I A LTW A L E L G E G 
TWQ N. .P SLG H.N EKE 

AAAAAGGGm CftGACICIAA GIMQCTIAC CIAGTOCTOC 800 

K K G K . Y I F ...R ...L . • V C L P _S _ P . P 
KRV NIYS DSK YAY LVLH 
KG.IYIQT LSMLT.SS 

ATQOOCMGC AGCAATKIGG AGAGfiGflGQG AATIOCIAAC TIL'IGfiGQGA 850 
CPC SNME REG IPN F.GN 
AHA AIW RERE FLT SEG 
MPMQ QYG ERG NS.L LRE 

PCPCCnmCk AOC2ATCAQQG AAQCCKTEfiG GftGATEATIA TIOQCTSEfiC 900 

TYQ PSG KPLG DYY WLY 
TPIN HQG SH. EIII GCT 
HLS TIRE AIR RLL LAVQ 

AGAAftOCIftA AGAGGIGGCA GICTIRCRCr GOCfiQQGICA TCBGGA8GAA 950 
RNLK RWQ SYT ARVI RKK 
ET. RGGS LTL PGS SGRR 
KPK EVA VLHC QGH QEE 

GAQGAAfiQQG AAATAGAMG CAATO3CCAA Q03GATATIG AftGCAAAAAA 1000 
RKG K.KA I A K RIL KQKK 
GKG NRR QSPS GY. SKK 
EERE IEG NRQ ADIE AKK 

aGOOQCAAGG CAGGACIUIC CATEAGAAKT QCTTKEftGAA GGACaOCTBG 1050 

PQG RTL H.KC L.K DP. 
SRKA GLS IRN A Y R R TPS 
A A r QDSP LEM LIE GPLV 

TOGQCICIGG GAAAQCAMC OCCAGERCIC AGCftGGAAAA 1100 
YGVI PSG KPS PSTQ QEK 
MG. SPLG NQA PVL SRKN 
WGN PLW ETKP QYS AGK 

ATaGAATfiGG AAftCCTCACA AGGACATftCT TTCCIO00CT CCAGAIQQOT 1150 
NR KPHK DIL SSP PDG. 
R I G N L—T R . T Y^r-F... P. P. L Q M A 
IE E TSQ GHT FLPS RW L 

FIG. 39 

NLRA RRT KEN .EDY ELF 
TCV LEGL RKT RKT MNYS 
LAC . K D • G K L GRL .II 

a CAA3GA3GIC C^ATAftCA GAAGAAAATC CTACTOOCTr 

NDV HYNT GER KKI LLPF 
MMS TIT QGKG RKS YCL 
Q.CP L.H RGK EENP T A F 

TCTOGPG?^A CIAAG33AGG CAJTCAGGAA GCAmCCAGS CAAGIQ3ACA 

WRD .GR H.GS I PG KWT 
SGET KGG IEE A Y Q A S G H 
LER LREA LRK HTR QVDI 

TIGGftGGCIC 1QGAAAAQ9G AAAAGTIGGG O^AATTGAAT G0CTAATAG3 
LEAL EKG KVG QIEC LIG 
WRL WKRE KLG KLN. A. .G 
GGS GKG KSWA N.M PNR 

GCTIGCnCC AGIGCAGICT ACA&GGAOQC TITAGAAAfiG ATIGICCAAG 
LAS SAV Y KDA LEK IVQV 
LLP VQS TRTL .KR LSK 
ACFQ CSL QGR FRKD CPS 

TAGAAATAflG CCQCX3DCICG TCCATQCCCC TTKKJDCAfiG GGftATCACTG 
EIS RPS SMPL MSR ESL 
K A APR PCP LCQG NHW 
Rn'k PPLV HAP YVK GITG 

GAAGQOClftC TG000CAGQG GftCGAMGTC CICTGRGICA GAAQCCftCTA 
EGLL PQG TKV L.VR SH. 
KAY CPRG RRS SES EATN 
RPT APG DEGP LSQ KPL 

aOCIGRTCAT CCSGCAGCAG GACTGftQQGT GOCCQGQQCA AGTKX3CAGCC 
PDD PAAG LRV PGA SASP 
LMI QQQ D.GC PGQ VPA 
T . . S SSR TEG A RGK CQP 

QflKXEKICA CCCTCWSfiGC CCCQQGTATG TTTGftCCATT GAGftGCCAGG 

CHH PQS PGYV .PL RAR 
H A I T L R A P GM FDH. EPG 
MPS PSEP RVC LTI ESQE 

AfiGITAftCIG TCICCIGGAC ACIG30GCAG CLT1CICBGT CITACriTCC 
KLTV SWT LAQ PSQS YFP 
S LSPGHWRSLLSLTFL 
VNC LLD TGAA FSV LLS 

CT\(~* OQ T3TCCEW3AC AATIGTOCIC CAGATCIGTC ACTATCCGBG Q3GTOCIMG 550 
I lO VPD NC PP DLS LSE GS.D 

b SQT IVL QICHYPR6PK 

CPRQ LSS RSV TIRG VLR 

AC^GQCAGIC MTIACAIACr TCTCTCSGCC JOMGTICT GACTGQ3GAA 600 

SQ S LHT SLSH .VV TGE 
TASH YIL LSA TKL. L G N 
QPV TTYF SQP LSC DWGT 

CnTfiCICIT TICACMGCT TITCTftKITA TGQCJIGAASG C0X3OQ0C 650 
LYSF HML F.L CLKA PLP 
FTL FTCF SNY A.K P H S L 
LLF SHA FLIM PES PTP 

TUilTA GQGft. GBGACMTTT AGCAAAftGCA GQ3Q0CATm TACACCIGAA 700 
C.G ETF. QKQ GPL YT.T 
VRE RHF SKSR GHY TPE 
LLGR DIL A K A GAII HLN 

CATM3GAAAA GSAMMOCA Ti'lUL'lGTGC GCT3CTIGAG GAAGSARTTA 750 

.EK EYP FAVP C Li R KEL 
HRKR NTH L L S PA.G RN. 
IGK GIPI CCP L L E E G IN 

ATCCIGA8GT CIG3QCAATA GAftGGBCAKT ATGGftCA&QC AAAGAftSOOC 800 
I L K S GQ. KDN MDKQ RMP 
S.S LGNR RTI WTS KECP 
PEV WAI EGQY GQA KNA 

OUltL'lbTiC AaGTEAAflCT AA^33ATICT QOCTOCTITC (XSaCCAAaG 850 
VLF KLN. RIL PPF PTKG 

SCSS. TKGFCLLSLPK 
RPVQ VKL KDS ASFP YQR 

CTTAGSOX33 ^O^CICAAA 2GATIGITAA 900 

STL LDP RPYK DSK DC. 
EVPS TR GPT RTQK IV K 

K y p LRPE ALQ GLK RLLR 

QGfiOCEWA GCCCAftGQCC TAGTAAAfiCC ATGCflGTAGC OCCTGCAATA 950 
GPKS PRP SKT MQ.P LQY 
DLK AQGL VKP CSS PCNT 

_ , -kt tr A \T a O & T 

T . K f J\ . • " " " - " - 

CICCAOTITT AGGftGTAftGG A&ACCCftACG GRCftGIGGSG GITfiGTGCAA. 1000 
SNF RSKE TQR TVE VSAR 
PIL GVR KPNG QWR LVQ 
LQF. E.GNPTD S G G • C K 

SQD Y. . GCFS SIP SCI 
DLRI INE A V F PLYP AVS 
ISG LLMR LFF LYT QLYL 

TAGCOCITAT AC1 U1ULT.lt (XCTAATACC AGAQGAfiGCA GRGTAGTITA 1100 
. PLY SAF P N T RGSR VVY 
SPY TLLS LIP EEA E.FT 
ALI LCF P.YQ RKQ SSL 

CAGICCTQGA CCITAAGGAT QCCILTl'ILT GCATCCCIGT iOTCCIGAT 1150 
SPG P.GC LFL HPC TS.F 
VLD LKD ASFC IP V HPD 
QS WT LRM PLS ASLY ILI 

TCICAATICT TCITTGICIT T3AAGAICCT TIGAACCCAA TCICICAATT 1200 

SIL VCL . R S F EPN VSI 
S Q F L F V F EDP LNPM SQF 
LNS CLSL KIL .TQ CLNS 

CACCIGSflCT GTTITfiCCCC AGQQGnCCG GGATAOaCCC OOCTATITG 1250 
HLDC FTP GVP G.PP SIW 
TWT VLPQ GFR DSP HL FG 
PGL FYP RGSG IAP IYL 

QCCAGQCATT AGCCCAftGAC TTGMCCftAT TCTCATRCCT GGACATCTIG 1300 
PGI SPRL EPI LIP GHLV 
QAL AQD LSQF SYL DIL 
ARH. PKT .AN SHTW TSC 

TCLT I ULJj JLA TGQGAT3ATT TAATTITAGC OOTGTICA GAAACCTTGT 1350 

LRY GMI .F.P PVQ KPC 
SFGM G.F NFS HPFR NLV 
PSV WDDL ILA TRS ETLC 

G0CA1CAW3C CAGCCAAGCG TICTEAAATT TOCTCftCICC GTOIGGCTOC 1400 
AIK'P PKR S.I SSLR VAT 
PSS HPSV LKF PHS VWLQ 
HQA T Q A FLNF LTP CGY 

ASGGTITOCA AACCAAAG3C TCAOCICTGC TCACAGCfiGG TEAAATACTT 1450 
RFP NQRL SSA HSR LNT. 
GFQ TKG SALL TAG .IL 
KVSK PKA QLC SQQV KYL 

A3QGTTAAAA TDOCCAAAG GCACCAQQQC CCICIGIGAG GAA1GTATCC 1500 

G.N YPK APGP SVR NVS 
RVKI IQR HQG PL.G MYP 
GLK LSKG TRA LCE ECIQ 

AftCCICTRCT GQ L.TiiftU.UlT CMGOCAAAA OCCTAAAGCA ACTAAGAAGG 1550 
NLYW LIF IPK P.SN .EG 
TCT GLSS SQN PKA TKKV 
PVL AYL HPKT LKQ LRR 

T JL Tl GGCftT AACASJl'l'lC TOOOGAA I 577 
PWH NRFL P 
LGI TGF CR 
SLA. QVSAE 

TCCAGCBGCA. QGftCIGfiGQG TO0CD33QQC AAGIGOCAGC CX2KD30CRTC 
SSSR TEG ARG KCQP MPS 

ACXXnCftGRG CXX033GTAT GTTIGACCftT 1GAGAGOCAG GAAGTEAftCT 
P S E PRVC LTI ESQ EVNC 

GICTCCIQGA CaCTCQCECA QCCTIdCAG TCTTftCTTTC CIGTOOZAGA 
LLD TGA AFSV LLS CPR 

CAATIGraCr CZOCATCCGA GQQGICCnAA GACftGQCAGT 

QLSS RSV TIR GVLR QPV 

CACTRCAUKC TICTCTCaGC CACTABGTIG TGACIQQGGA ACTTEftCTCT 
TTY FSQP LSC DWG TLLF 

TITCaCAIGC ' m'lC'JM TT ATGOCIGAAA Q000CACIGC LTlGTiJAGQS 
SHA FLI MPES PTP LLG 

AGAGACKTIT TSGCAAAfiGC AQ3330CAIT ATACftCCIGA ACAXAQGAAA 
RDIL AKA G A I IH L~N I G K 

A3GAATMCE ATTIGCIGTC O0CT3CTIGA GGAftGGAATT AATCCTGAfiG 
GIP ICCP LLE EGI NPEV 

TCIGOQCAfiT AGAfiGGftCAA TATGGftCAAG CAAAGAATGC GCGTOCT3TT 
WAI EGQ YGQA KNA RPV 

C^fiGITAAftC TAAM33ATIC TGDCIOCTIT 03CTACCAAA. QGA£Ci:A:'TH 
QVKL KDS A SF PYQR KYP 

TCCftOCAOCA GGACIQCGG TCOTOOCC WCPOQCWr CCWGCCATC 
G A R G KCQP MPS 

grOTTTTAT GITTGftCXAT TGflGAGGCTC GAAGTTAACT 
PS E PRVC LTI ESQ EVNC 

GTCiaCTGGA CACIGOCGCA QCETICIOW5 TCTTACmC CTGTCOCAGA 
L L DTG A AFSV LLS CPR 

ayoTTirair ccagatctgt cactatoea aunuL'URG gaqgocagt 

QLSS RSV TIR G V L G Q P V 

OCTtfAITC TTCICICAGC OCTAAGTTG TCAL-IULUA ACTTTACTCT 
TTY FSQP LSC DWG T L L F 

Tl ' ICA CATGC TTTTCBATT A3SCCIGAVA QOCCCftCia: LTiuruOE 
SKA FLI MPES PTP LLG 



AGAGACATTC TAGCAAAAQC AGSOUXflTT ATACAOCTCA 
R D I L A K A G A I IHLN 



AQOTiOGAAA 
I G K 



AGGAAOAOCC Ai'nUL'ib'R: CCCTOCTEA GGftAOGAATT AKPtrTGAAG 
GI? ICCP LLE EGI NPEV 



' H ' j i ui s at AGAAGGACAA TOTGGACWC CAAAGAMEC COJ3CCIGTT 
WAX EGQ YGQA KNA RPV 

CAAGTTAAAC TAAAGGHTTC lULXlUL'l'lT GflCTAOMA GSAAGTAOGC 
QVKL KDS ASF P Y Q R K Y P 

tcttagmjj: GAGQCCCTAC AJGGMOCA AAfGATTOTT AAGGAOCBVA 
LR? E A L Q GXQ KIV K D I* K 

AAGdXAAGG GCmJEPAAA CCATGOGTA Q0QCCIGC7A TACICCAATT 
A Q G L V K PCSS PCN TPI 



T AGGA3TAA G5AAA0QCAA OG3ACAGTGG AGSTTAG7QC AAGHTCICTG 



GVR KPN GQW RLVQ 
region A 



D L R 



GATDCTAAT GAGGCTGrTT TJUCiUlWA GOOGCTGIA ' 
I I N EAVF PLY PAV SSPY 

ATAcnrccr ttooceaata co^gaggaag cagagtqqtt tactgtocig 

TLLSLI PEEAEWFTVL 

GAoenwas jcraxxmT csxxrarr umjuiiuu acictcaatt 

DLKD AFF CIP VRPD SQF 

crronTGo: tttgaagatc ctttgaaocc AAOGTCICAA CIQCCTOGA 
LFA FEDP LNP TSQ L T W T 

ciu ' iTiT AOC azAAaasnc agggatagcc arAicnar iwj :WTK 

VLP QGF RDSP H L F GQA 

TTAGOTAAG ACTTGAGTO ATTC1CMAC CKCflOCIC TTdCOTa 
L A Q D L S Q FSY LDTL VLQ 

GTACGTGGAT GATITAtTTT W1UJU0JG TIChGAAAGC TIG1GCCMC 
YVD DLLL V A R SET LCHQ 

AAGOOCOCA XjAACTCTTA ACTTTQCaCA OTCCTCTCG CTACAAGGTT 
A T Q ELL TFLT TCG Y K V 

TCOVAACCAA A3GCTO0GCT CK3CXC3CAG GAGATTAGBT ACnAGOQCT 
SKPK ARL CSQ EIRY LGL 



AAAATIA 
K 



a^ATCcj AAAGGOCCA 

lsIkgtr AL 
k TEciaaccc aaaaco 

PHP K T 1 

; GrrrcnrrcD waacmrtt 

F C R k Q I 



G3XXX3CAG TSTGGAAOGET ATOOUGCCTA 
S EER I Q P I 



TAciarnx ttciooccc amaodciaa aocaactaas Aam-iu.Tr 

L AY P HP KTLK QLR GFL 

OtGACCKTTA TATACMAA TUGOQWC TCAG\AAGOC AATAXTATT 
BPL YTLI RET Q K A MTYL 



TAGTAAGATC GACAOT3CA GWCTTOIT TOOGCOTT AAAGAAGQX 
VRW TPT EVAF Q A L K K A 



CTAACCCRAG aECMTOTT CWJJTIUOA ACKEGCRAG ATnTTCPTT 
LTQA PVF SLP TGQD FSL 



ATATOXACA GBAAAAACAG GAATAGdCT AGGPGTCCTT AGGQGGTCT 
YAT EKTG I A L GVL TQVS 

CAGGGA2GAG CTTGOMTC GKETATA0C TGAGTAAGGA AATTGATGUi 
GMS LQP VVYL SKE IDV 

GTGGCAAAGG G1TGQCOCA TTGTITATGS GTAATGQ0QS OGHGCXSt 
VAKG WPH CLW V H A A VAV 

crTPcmrcr gmvocxttta aaataataca gogmgagat cxtactgxgt 

LVS EAVK IIQ GRD LTVW 

GGAQOCTCA TG\TGIGMC GGGXIACIOV CTQCTAAAGG AGfiCnXJIGG 
TSH D V N GILT A K G DLW 

TTGTCAGACA A0CATTDCT IMTSOCRG GCICTATISC TTGAflGAGOC 
L S D N HLL NYQ ALLL SEP 



AGTOnGAGA OGOGCACTT 
V L R L R 



7*CFT GTGZAACTCT TAAICOpCC ACATI 
TC ATL KPR TF 
.BflK0£ I 



ACATTTCTrC 
L P 



CAGACAATGA AGXAAflCTTA GAZCATAACT GTCMCAAGT AATT0CTCAA 
DNE E K I EHNC QQV I A Q 

AXTATGTTC CTCGMGCGA CCSTCDQG GTiaCCITGA CKMCODGA 
TYAA RGO LLE VPLT DPD 
( ^RNtNH 

ancAAq^tG tatactgrtg GMGTrocrr ggogaaaaa ggagxtggaa 

L N L Y T DG_S SL ABK G L R K 
I 

AA U-mi»lA T0CAGTG7CC AG2GATAKEG GAATACITGA AAGTAATOQC 
AGY AVI SDNG ILE SNR 

CTCACICCAG GA\CDG35C TOCdGQCA GAACTAATAG CDCTCACTIG 
LTPG TSA HLA E L I A L T W 

OJOCrDGAA TDWQSKjAAG GAAAAAGGGT AAATATATAT 1CX3CTCEA 
ALE L G E G R R V N I Y SDSK 

A7TATGCTTA OCTAGICCTC OOXXJXB3 CKZAATATG GAGAGAGAGG 
YAT LVL HAKA A I W RBR 

GAATTOCTAA CTTCIGAGSG AACAGCOlDC AAGCATOG3 AAGOCATTAG 
EFLT SEG TPI NKQE AIR 

GAGMTATTA TEGOGIAC AGAAAOCTAA AGA0STGQCX GTCn&CJCT 
R L L LAVQ KPK EVA VLHC 

GXK3GGTCA TOtGGVtfAA GA0GAAAGQ3 AAATAGAAGG CAAT050OA 
QGH QBE EERE IE G N R Q 



OXLtfOai'lU AA9CAAAAKA AGCEGCAAOS CMACICIC CTCTDGAAAT 
ADIE AKK AAR Q D.S P L E M 

GCTT 

A3X3ATOCAGC AQCAGGAOE AGGGIGaCXE 50 
MIQQ QDX GCP G Q A P AHA 



CATCftOCCTC ACftGAGCOOC AGGTATOCTT QC3IGAGAAQG 100 

ITL TEPQ VCL TIE GQKG 



GTOACTOIUT QCIQGACACT GGCX3GNQ0CT TCICAGICTT ALTllUCTGfT 150 
XCL LDT GGAF SVL LSC 



TOIQCIOCAG ATCTGICACT GIOOGAGQQG TOCTAQGACA 200 
PGQL SSR SVT VRGV LGQ 



GCX^AGICACT AGATACTIUT (XC^GCCACT AAG'l'lGlGAC TOQQGAACIT 250 
PVT RYFS QPL SCD WGTL 



T50CITOC3C ACAIGCTnT CIAATEA2GC CTGAAAQOOC OOCIUITC 300 
LFP HAF LIMP ESP TLL 



ITOQQrafflG ACATICEAGC AAAAGCAGC33 QCCATEATAC A3GIGAMAT 350 
LGRD I L A KAG AIIH VNI 



AGGAGAAQGA ACAACTOTIT CTTCTTOCDCT GCTIGAGGAA GGAATEAATC 400 
GEG TTVC CPL LEE GINP 



CIGAAGICOG GQCAACA3AA GGACAAEAIG GACAMCAAA GAAIGOOCJGrr 450 
EVR ATE GQYG QAK NAR 



CCIGITCAftG TEAAACTAAA QGATTOCAOC TOCTTKXXT AOCAAfcGQCA 500 
PVQV KLK DST SFPY QRQ 

GTAOXDCTC AGADOCGAGA CCCAACAAGA ACTGCAAAAG ATiUlAAAGG 550 
YPL RPE T QQE LQK IVKD 



ACCTAAAAGC CO>AGGOCTA GTAAAACCAA QCAATAQQOC TIGCAAGACT 600 
LKA QGL VKPS NSP CKT 



CCAATITIAG GAGTAAGGAA AGCCAAGGGA CAGIGGAGGT TAGIGCAAGA 650 
PI LG VRK PNG QWRL VQE 



ACTGAGGATT ATCAATCAGG CIGTIGTIOC TCTA3AGOCA QCT3EA0CIA 700 
LRI INEA VVP LYP AVPN 



AGGCTEATAC AGIQCITICC CAAATACCAG A3GAAGCAGA GTQGTTEACA 750 
PYT VLS QIPE EAE WFT 



GIUCIGGAOC TIAAGGATCC ITl ' lTU ' lU C MUJITGTAC GTOCTCACTC 800 
VLDL KDA FFC IPVR PDS 



TCA AM ' I L TIU THGGCTHG AAGATOCTTT GAAOOCAAGG TCICAACTCA 850 
QFL FAFE DPL NPT SQLT 



OCIGGACIGT TTEAOOQCAA GGGTICAGGG ATAQ000QCA TCEATTIGGC 900 
WTV LPQ GFRD SPH LFG 



CAQQCATIAG CGCAAGACTT GAGICAATIC TCATAGCTGG ACACTCTIGT 950 
QALA QDL SQF SYLD TLV 



GCITCRGflAC ATGGATGATT TACTTTEAGT O30OCGTICA GAAACCTIGT 
LQY MDDL LL V A R S ETLC 

GOCATCA^GC CACOIAAGAA CTCITbNJFT TGCTCACEAC CIGIQQCIftC 1050 
HQA TQE LLTF LTT C G Y 



AAGG7ITIOCA AMCAAMGC TOQQCTCIGC TCACMGAGA. TIZO¥J3*CIN 1100 
KVSK PKA RLC SQEI RYX 



AGQGCTAAAA TIAIOCAAAG QCAOCAGQGC (XHX^GTIGAG GAADGTCATOC 1150 
GLK LSKG TRA LSE E R I Q 



AGOCTftTACT GQCTEAIOCT CATOOCAAAA (XUI&AAGCA ACIAftGBGQG 1200 
PIL AYP HPKT LKQ LRG 



T?^CAGGTIT CT3G0GAAAA CftGATiaXA GCTftCASCXC 1250 
FLGI TGF CRK QIPR YXP 



AAEAGGCAGA 03VITATATA CftCTAATTMST QGAAACTCAG AAMC3CAATA 1300 
IAR PLYT LIX ETQ K AN T 



QCTATTEAGT AffiATOGftCA OCTACAGAAG TOQCITTDCA QQCXDCTAAAG 1350 
Y L V* ' R W T PTEV AF Q ALK 



AAGGC30CTAA OEAAGCCCC AGIGTCTCAGC TIGOCAACAG GGCAftGAJTT 1400 
KALT QAP V FS LPTG QDF 



TidmAimr gocacagaaa aaacaggaat agctctogga giott&cgc 1450 

SLY ATEK TGI ALG VLTQ 



AGCTCTCAGG G^AGCTIG CMOOOGIGG TATAOCTGAG T&MGAAA3T 1500 
VSG MSL QP VV YLS KEI 

GMGEW3IOG CAAAQGGTIG GCXJECATNGT TEKIGG3IJA& TQGNQQCAGT 
DVVA KGW PHX LWVM XAV 



AQCaGICINA GTATCTGAAG CAGTTAAAAT AATACAQQGA AGAGATCTIN 
AVX VSEA VKI IQG RDLX 



CIGTOTOGfiC ATCTZATGAT GTGAACGGCA. TACTSRCIGC TAAAGGAGAC 
VWT SHD VNGI LXA KG D 



TlUiUaTlOr CAGACAACCA TITACITAAN TAYCAGQCYY TATTACITCA. 
LWLS DNH LLX YQAL LLE 



AGAGCCAGTG CIGNGACTGC GCALT1U1UC AACTCTEAAA GCCAAACTIA. 
EPV LXLR TCP TLK PKLM 



1QC7K3CDCAG AAQGATCTTT NEAGAQSICC CCITAQCCAA. C02IGACCIC 
L P R R I F X E VP LAN PDL 



AACTATATAT ATACTGATQG AAGTICGTIT GEAGAAAAGG GATTACAAAG 
NYIY TDG S SF VEKG LQR 



G3flAGGATAT NCOOAQGIG TIAGIGATAA AGCAGCACIT GAAAGTAAGC 
XGY XIGV SDK AVL ESKP 



cicnaxcc ccagggacea gogcxxxdst tagcagaact agiggcacig 

LPP OGP APPL AEL VAL 

AGAIAGCAAG TATOCTEfiTC TAATCDGAAA IQCXXaTOTT QCAATATOGA. 
DSK YAYL IRN A H V AIWK 



AAGAAAG9GA GTIOCTftACC TCIGQQGGAA (XXXCATEAA ATAOCACAAG 
E RE FLT SGGT PIK YH K 



TEAATCAIQG AOTIM'ILCA CACAGIGCAA AAACICAAG3 AQGIGGAAGT 
LIME LLH TVQ KLKE VEV 



CTEACaCIQC CAAAGOCaTC AGAAAAGQGA AAGAGQGGAA GAGCAQCATA 
LHC QSHQ KRE RGE EQHK 



AGTG3CIACA GAGGCAAGGA AAGACTAQCA GAAAGGAAAG AGAGAAAGAG 
WLQ RQG KTSR KER EKE 



ACAGAAAGTC AGAGAGAGAG AGAGGAAGAG ACAGAGCACA AAGAGGGAGT 
TESQ RER EEE TEHK EGV 



CftGAGAGAGA GAGAGACAGA GAGTCAGAGA GAAQGAAAGA GAGAGAGGAA 
RER ERQR VRE. KER ERGR 

SACTIGftGCC AGTCCTCATA CCTGGACA1T LTllSH^PC AGT ATfpGQ A 
GACTIGAGGC AGIOCTCATA CCTSGACA3T CriWlV! I IC MTATfjSGGA 

IC 



SRCITCAGCC AGTCCTCATA CCTGGACATT ClUUTlt}. 
r> ir mMrm affnnrmntTO rv*m*MT 



AGTA3 i 3GGA 



JlTAATT ATAGCCACCC ATTCAGAAAC CTTGTGGCA V 
ITTAATT ATAGCCACCC ATTCAGAAAC CTTGTGGCA V 
SpTAATT ATAGCCACCC ATTCAGAAAC CTTOTGGCA : 
Im^tm 1™^™^ ivrnifimn r^mrmnriir 



: ^AAGCCACCC 
rftRfYITAnT 



AAGI3CTCTT AAATTTCCT 

aag:gctctt aaaittcct 
AAc:c3CTerr aaatticct 



3CTACCTGTG GCTCCAAACA AA^SGCTCAS 
3CTACCTGTG GCTCCAAACA AAR 3GCTCA - 
3CTACCTGTG GCTCCAAACA AA 0 3GCTCA 3 
nrmnrmrrz tymnaanra aaUra^rab 



ctctcctcac aHcaggttaa ATACTTAGGG CTAAAATTAT CCAAAGTC ■ : 
CTCTOCTCAC A : CAGCTTAA ATACTTAGGG CTAAAATTAT CCAAAGTC R - 

CCAAAGTC ft Z 



CTCTOCTCAC A : CAGGTTAA ATACTTAGGG CTAAAATTAT 

TC*^ aUna/ anwraa nmnmvirm f^p&a&anfTOT ptaaaj 



CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGG » T 
CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATA CTGG I V 
CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGG : T 

AAGCACTCTT AAATITCCTr GCTACCTGTG GCTACAAGGT TTCCAAACCA 
AAGCACTCTT AAATTTOCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA 
AAGCACTCTT AAATTTCCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA 

^AAAGGCACC AGAACCCTCA GTGAGGAAOG TATCCAGCCT AIACTGGGTT 
CAAAGGCACC AGAACCCTCA GTGAGGAAOG TATCCAGCCT ATACTGGGTT 
rn ^^nni r v* ^.aarrpn^* /ymarysaann T&Trrinrcv ATArnryragl 



ATCCICATCC CAAAACCCTA AAGCAACTAA CAGCX71'1XJCT TGGCATAACA 
ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGO GCTCCT TGGCATAACA 
ATCCICATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 

ATTATGCCTG AAAGCCCCAC TCCCTTGTTA GQGAGAGACA TTTCAGCAAA 
ATTATGCCTG AAAGCCCCAC TCCCITCTTA GQGAGAGACA TTCTAGCAAA 
AGCAGGGGCC ATTATACACC TGAACAIAGG AAAAGGAATA CCCATTTGCT 
AGCAGGGGCC ATTATACACC TGAACAIAGG AAAAGGAATA CCCATTTGCT 
GTCCCCTGCT TGAGGAAGGA ATTAATOCTG AACTCTGGGC AATAGAAGGA 
GTCCCCTGCT TGAGGAAGGA ATTAATOCTG AAGTCTOGGC AATAGAAGGA 
CAATATGGAC AAGCAAAGAA TGCCCGTCCT GTTCAAGTTA AACTAAAGGA 
CAATATGGAC AAGCAAAGAA TGCCCGTCCT GTTCAAGTTA AACTAAAGGA 
TTCTCOCTCC TTTCCCTACC AAAGGAAGTA CCCTCTTAGA CCCGAGGOOC 
TICT G CCTCC TTTCCCTACC AAAGGAAGTA CCCTCTTAGA CCOGAGGCCC 
TACAAGGANC TCAAAAGATT GTTAAGGAGC TAAAAGCCCA AGGCCTAGTA 
TACAAGGANC TCAAAAGATT GTTAAGGAGC TAAAAGCCCA AGGCCTAGTA 
AAACCATGCA GTAGOCCCTG CAATACTOCA ATTITAGGAG TAAGGAAACC 
AAACCATOCA GXAGCCCCTG CAATACTCCA ATTTTAGGAG TAAGGAAACC 
CAACGGACAG TGGAGGZTAG TGCAAGATCT CAGGATTATT AATGAGGCZG 



CAACGGACAG TGGAGGTTAG TGCAAGATCT CAGGATTATT AATGAGGCTG 
TriTlCCTCT ATACCCAGCT GTATCTAGOC CTTATACTCT GCTTTCCCTA 



TITITUCIC T ATACCCAGCT GTATCXAGCC CTXATACTCT GCTITCCCTA 
ATACCAGAGG AAGCAGAGIG GTTXACAGTC CTGGACCTTA AGGATGOCTT 
ATACCAGAGG AAGQMSAGTG GTTTACAGTC CTGGACCTTA AGGATGCCTT 
TXTCTGCATC (XTtflft O tflC CTGACXCTCA ATTCTTGTTT GOCTT1GAAG 
TTTCTOCATC CCTO IROgIC CTOACTCTCA A1TCTTCTTT GCCTTTOAAG 
ATCCITTGAA CCCAACGTCT CAACTCACCT GGACTGTTIT AOOCCAAGOG 
ATCCTTTGAA CCCAACGTCT CAACTCACCT GGACTOTTXT ACOOCAAGGG 



TTCAGGGATA GCCCCCATCT ATETGGCCAG GCATTAGCCC M 3A CTTG AG 
TPCAGGGATA GCCCCCATCT ATTPQGCCAG GCATTAGCCC AH2AC3SS&d 650 

XAAAGCAACT AAGAfraGTTC CTTGC3C ATft\ 
XAAAGCAACT AAGA ? 3GTTC CTTGGCATA 

ATTCCC - 3 D T ACA3YG? AAT AGCCAG:OCA 
^TnywlkU r M^kvfi MaftT AfW RfTTTfl TTA 




^Vlff AdGGA AACTCAGAAA GCCAATACCfr EffolAGTAAG ATGGACACCU 
jfcEsGA AACTCAGAAA GCCAATAOC : feflfc EAOTAAG ATGGACACCT 

AGTCTTCAGC TTGCCAACAG GGCAAGATTT OTCTTTATAT GCCACAGAAA 
AGXCTTCAGC TTGCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA 
AAACAGGAAX AGCTCTAGGA GTOCTEACGC AGGTCTCAGG GATOAGCTIG 
AAACAGGAAT AGCTCTAGGA GTCCTTACGC AGGTCTCAGG GATGAGCTIG 
CAACCCGTGG TATACCTGAG TAAGGAAATT GATGTAGTGG CAAAGGGTTG 
CAACCCGTGG TATACCTGAG TAAGGAAATT GATGTAGTGG CAAAGGGTTG 
GCCTCATTOT TTATOGGXAA TGGCGGCAGT AGCAGTCTEA GTATCTGAAG 
GCCTCATTOT TTATOGGXAA TGGCGGCAGT AGCAGTCTTA GTATCTGAAG 
aUSTEAAAAT AATACAGGGA AGAGATCTTA CTSTOTGGAC ATCTCATGAT 
CAGTTAAAAT AATACAGGGA AGAGATCTTA CTGOTTOGAC ATCTCATGAT 
GTOAACGGCA TACTCACTGC TAAAGGAGAC TK3TOGTTGT CAGACAACCA 
GTCAACGGCA TACTCACTGC TAAAGGAGAC TTCTOG1TGT CAGACAACCA 
TTTACTEAAT TATCAGGCTC TATEACTTGA AGAGCCAGTG CTGAGACTGC 
TITACTEAAT TATCAGGCTC TATEACTTGA AGAGCCAGTG CTGAGACTGC 
GCACTTCTGC AACTCXTAAA OCCGCCACAT TTCTTCCA^ 
GCACTTGTGC AACTCITAAA OCCGCCACAT TTCTTCCAGA CAATOAAGAA 
AAGATAGAAC ATAACTGTCA ACAAGTAATT GCTCAAACCT A^l^i^ 
AAGATAGAAC ATAACTGTCA ACAAGTAATT GCTCAAACCT ATQCTGCTOG 
AOQGGACCTT CTAGAGGITC CCTTGACTGA TCCCGACCTC AACTTGTATA 
AGGGGACCTT CTAGW3GTTC CCTTGACTGA TCCCGACCTC AACTTGTATA 

ATTATOCCTC AMGCCCCAC TCCCTTGTIA OMftBACACA TTTXMJC^ 
linflGOCTG AMGOCCCAC TCCCTTOTTA OQGfiGAG**. Tm»SC~* 
MOGQQ3CC ATTATACACC TGMCftlWW CCOOTIOCT 
AGCMGGGCC ATTAIACACC TCAACATAQG AAAMWMITA CCCATTTCCT 

cnaax-xocT tgaqgarqga attaatcctg «g»GAMSA 

GTCCCCTGCT TGRGGAAQGA ATXAKTCCTO AAGICTQQGC AAXAGfcABSA 
CAATATCGAC AAQCAAAGAA TOCCOGTCCT GTTCAAGTtt 
CftftlftSGGAC WVGCAM^ 

maccciec tttccctacc aamgaagta ccctctirga ccogmkocc 
wmcciee titccctacc aamgaagta cccTcrTAGA ccaaoGcec 

•JACAAGGANC TCAAAAGATT GTPAftOtaCC TAAAAGCgCA »»»» 
XACAAGGANC TCAAAAGJOT GrTfcfiQGACC TAAAASCCCA AGGCCTBGEA 
AAACCATQCA CTEACCCCCTG CRMRCICCA MTTTRO»G TAMSAMCC 
AAACCATQCA GEftGCCCCIG CAATACTCCA ATITTAGGAG TAMGAMCC 
CRACQSWCftG TGCSAGGTTOG TGCAAGMCT CfiBGCTITOT AftTOMjQCTO 
CAACOGACAG TQGAQGTIAG TGCRM3RTCT CWSGATTRTT AKFGMGCTO 
ri ' lTlU-I Cr ATACOCAGCT GTATCTAGCC CTTOaMTCT ^^ "^ 

rrrnu-icr atacccaoct gtktctrqcc ctiatactct gctticccta 

MftCCABftGS AaBCaGMTO GOTH«3MOT CTGOTC^ AOSMSOT 
ATACCAGAGG AAGCAGAGTG GTTEACfiGTC CTQGRCCITA AGGMGCCTT 
TTTCTGCATC CCTOBUCGIC CTTOCTCTCA AWCTrGTTT SOClTBaM 
^^CATC CC^IACGTC CTCACTCTCA ATTCTTGTXT gcctttgaac 
MCCTTOAA CCCMCGTCT CAACTCMTT GmCTOTTra ^^f^ 
ATCCTITCAA CCCAACGTCT OUVCTCACCT GGACTCJITrr ACCCCAAGGS 
TXCKSGGMA GCCCCCATCT ATTTQGCCAG GCMTftGCOC Aflf 



TTCAG3GATA GCCCCCATCT ATITOGCCAG OCATTW3CCC A? 



rCTTGTCCT TC 



TjcAtHrcrrcA tacctggaca _ , 

dSBlWrcA TACCTOGACA HlCTIOTCCT TC 



Hi 



TS GATG& TTEA C 
ko GATQATRAC 

taSJS&SO&XCEAC 




ACCITOIQCC ATC*AGCCAC CLAMJf 
ACCITOIGCC ATCAAGCCAC CGAAS 



Ti-rrTiTinrrTir ^ Rr * 




TSGCTACAAG GTTICCAAAC 
TOGCTACAAG C7TITCCAAAC CAAAQGCTC V 

■ ypp^wLna^ n rr^^^TiBr ^»»^a«igk 



[Alignment position markers: 50 to 800 in steps of 50 for the outer rows; the middle row is numbered 8, 58, 108, 158 from position 650 onward]



%\Qx '51 A ( 



Corv*,. 



MSRV pol 
cons ADN 41,42,43 
Consensus 



pcrcrarrcA CACUAdiiTA Hat 




ATA CTTA GG GCXAAAATTA TOGAAAGGCA 
ATACTTAGG GCTAAAATTA TCCAAAGGCA 
rk n -iu^Tft mr*lht**m H *'™' w TTnry: rar»ra&Bftim»» *mn***fw* 



m 



CCCT CAGTGAGGAA CGTATOZAGC CTATACTGGC 155555555 
SCCT CAGTGAGGAA CGTATCCAGC CTATACTGG 3 ITA3CCICA3 
TTT rtrrrr.irvzz* nry™**™^ ™™~r~l* L^mn^m™^ 



CCCAAAACCC TAAAQCAACT AA 
CCCAAAACCC TAAAQCAACT AA 



_ Tx-n GrrC CTTQGCATAA CAOGTTTCTG 

CCCAAAACCC TAAAQCAACT AA|£gH3TTC CTTGGCAIAA CAflGTTTCTC 

rrrrr fmnnn*™* wg. f j. 




AGCCAGACCA TTAIATACAC 
AGOCAGACCA TTAIATACAC 

.AGCCAGACCB^AaU&XftCAC 



TAATTA:SGA AACTCAGAAA GCCAf^ASr (ATlTAGTAAti ATOGACAt 
CAATTAR3GA AACTCABAAA GCCASTACC; AITTAGTAAG ATGGACaS 

HAA33at hfs a a arrrTiHaaa nrraUr^nrir Lvmem^n *nra»™I 




„ TSGc ri AT -uA GGCCCTAAAG ftAQQOCCTAA CCCAAGCCCC 
: GAGAAG TOGCTTTCCA GGCCCTAAAG 



AQCAGAAG-IDQQCXEDCCL. 



AGTGTTCAGC TTGCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA 
AGTGTTCAGC TTOCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA 
AAACAGGAAT AGCTCTAGGA GTCCTTACGC AGGTCTCAGG GATGAGCTTC 
AAACAGGAAT AGCICTAGGA GTCCTTACGC AGGTCTCAGG GATSAGCTTG 
CAACCOGTGG TATACCTGAG TAAGGAAATT GATGTAGTGG CAAAGGGTTG 
CAACCCGTGG TATACCTGAG TAAGGAAATT GATGTAGTOG CAAAGGGTTG 
GCCTCATTGT TTATGGGIAA TGGOGGCACT AGCAGTCTTA GIATCTBAAG 



GCCTCA3TGT TEA3G0GTAA TGGOGGCAOT AGCAGTCTTA GTATCTGAAG 
CAGTTAAAAT AATACAGGGA AGAGATCTTA CTGTGTGGAC ATCTCATGAT 



CAGTTAAAAT AATACAGGGA AGAGATCTTA CTGTGTGGAC ATCTCATGAT 
GTGAACOTCA TACTCACTGC TAAAGGAGAC T lU fl WlTlUT CAGACAACCA 
GTGAACGGCA TACTCACTGC TAAAGGAGAC lWlUUTIW CAGACAACCA 
TTTACTTAAT TATCAGGCTC TATTACTTGA AGAGCCAGTS CTGAGACTGC 
TTTACTTAAT TATCAGGCTC TATTACTTGA AGAGCCAGTS CTGAGACTGC 
GCACTTGTGC AACTCTTAAA CCOGCCACAT TTCTTCCAGA CAATGAAGAA 
GCACTTGTGC AACTCTTAAA CCOGCCACAT TTCTTCCAGA CAATGAAGAA 
AAGATAGAAC ATAACTGTCA ACAAGTAATT GCTCAAACCT ATGCTGCTCG 
AAGATAGAAC ATAACTCTCA ACAAT?t&&t*p rwiM^ » w > 



[Alignment position markers: 850 to 1050 in steps of 50, ending at 1097, for the outer rows; the middle row is numbered 208 to 438]



MAG KAGGCCCTAA CCCAAGCCCC 1100 



[Alignment position markers: 1147 to 1547 and 1150 to 1500 in steps of 50 for the outer rows; the middle row remains at 438]





AGGGGACCTT CTAGAGGTTC CCTTGACTGA TCOOGACCTC AACTTCEATA 
AGGGGACCTT CTAGAGGTTC CCTTGACTSA TCCCGACCTC AACTTOTATA 



1597 
438 

1600 



Trans of MSRV pol 
cons prot 41,42,43 

Consensus 




^5 



&J69 

SZ B 



IMPESFTPLL GRDILAKAGA IIHLUIGKGX PICCPLLEEG INPEVWAIBG 



QYGQAKNARP VQVKLKDSAS FFYQRKYPLR PEALQGXQKI VKDLKAQGLV 



KPCSSPCNTP ILGVRKFN3Q WRLVQDLRII NEAVFPLYPA VSSFYTLLSL 



IPEEAEWFTV LDLKDAFPCI PVRPDSQFLF AFEDPUOPTS QLTWTVLPQG 



FRDSPHLPGQ 




bQLLL 



DDLLLvfftf SE 
DDJJLL m SE 



TLCHQATQ : L 
TLCHQATQ q L 



SK TTrKQATQ 



FL I ICGYK VSKPKA : LCS C SIT VLGLKL 



EL 



LfpTf ICGYK VSKPKA JjCS QpVIfYIiGLKL 



SKGTP =1 LSEE RIQPUi 
SKGTRILSEE RIQPILf 



r.CTPP PTQPTTl 



PKTLKQIRCF 



pktlkq: 

PKTTiKQT 



►IE?F 




3TQK ANTiLVRWTF 
2TQK ANIJ: LVROTF 



ALTQAFVFSL PTGQDFSLYA TEKTGIALGV LTQVSGMSLQ 



PWYLS^^ ^AKGWPHCL WVMAAVAVLV SEAVKIIQGR CLTVWTSHUV 



[Alignment position markers: 50 to 450 in steps of 50 for the outer rows; the middle row is numbered 36, 86, 136, 136, 140 from position 250 onward]



NGXI/TAKGDL WLSENHLLHY 
AF 




RLRTCATLKP ATFLPENEEK 



IEHNCQQVIA QTYAARGDLL EVPLTDPDI2T LYTOGSSLAE KGLRKAGYAV 



ISENGH-ESN RLTPGTSAHL AELIALTWAL ELGEGKKVNI YSDSKYAYLV 



LHAHAAIWRE REFLTSBGTP INHQEAXRRXj LLAVQKPKEV AVLHCQGHQE 





[Alignment position markers: 500 to 650 in steps of 50 for the outer rows; the middle row remains at 146]



EEEREIEGNR QADIEAKKAA RQDSPLEMLI BSP 683 
146 

683 



cons ADN 41,42,43 
cons ADN 1,5,8 

Consensus 



FIG 53 A 



AGTCiifrCATA CCTQGACAKr CTTCrtZZlTC 
AGTCiTCATA CCTQGACflir CTTGTI^IC 
rn^trav a^HvltrY^'Pft nHOraariiUr rTTTTTMTMir. 




50 
50 

50 






RT^fm^HT^nrfF^ CAAGCCACCq 

TCjiCAt bAAGCCACCCj 



ATTCAGAAAC CTICTC 




AAATTICCTr 3CTACCTCTG 
TCTT AAATTTCCIif 3CTACCTCTG 



.AAA3 



GCEACAAOGT 
GC- 



X3CXCACAAOGT 



TCTGCTCACA 
TCTGCTCACA 

jjCTQCDCACA. 



iCAGGTEAAA TACTTAGGGC TAAAATTATC 
: CAGG1TAAA TACTTAGGGC TAAAATTATC 
^>/yynn ** rmrTH**ttnn T&aa&TTOHr 



CAAAG^CtCC 
CAAAG3CP CC 



"AGkdCCCTCA GfcfcAGGAACG TATCCAGC I T ATACTQG 
AG3CCCCTCA G13AGGAACG TATCCAGC ] P ATACTCG 



ATCCICATCC CA»AACC:rA 



AAGCAACTAA 
AAGCAACTAA 

AT or4Jn&tnnn n&Lj&&rrMra aaryftarTftft 



ATCOCATCC CAflAACCyTA 



xjTTCCT TQGCATAF 
falTCCT TGGCATAf 



[Alignment position markers: 100 to 300 in steps of 50 for the outer rows; the middle row is numbered 191, 241, 291 from position 200 onward]







SGACAiCTGA 
sgaca: 2TGA 



M3CAGAAGTG GCTTPCCAGG CCCTAAAG 
WCAGAAGTG GCTTTCCAGG CCCTAAAG 



438 
429 

438 



FIG 53 B 



cons prot 41,42,43 
cons prot 1,5,8 

Consensus 




DLSQSSYLEr 
DLSQSSYLEI 

nT| CyCVTT . 



KAQLCSQQVK YUGLKLSK] T 
KAQLCSQQVK YLGLKLSK^T 





[Alignment position markers: 50, 100, 146 for the outer rows; 47, 97, 143 for the middle row]






PCT/IB 97/01482 

A. CLASSIFICATION OF SUBJECT MATTER 

IPC 6 C12N15/48 C12N5/^P C12N7/02 C07K14/15 CHB/12 
C12N9/22 C12Ql/7<^ C07K16/10 G01N33/569 A61K39/21 
A61K39/42 A61K48/00 

According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 

IPC 6 C12Q C07K C12N 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 



Electronic data base consulted during the international search (name of data base and, where practical, search terms used) 



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

A          EP 0 731 168 A (BIO MERIEUX) 11 September 1996                                          1-35
           see the whole document

A          WO 95 21256 A (BIO MERIEUX; PERRON HERVE (FR); MALLET FRANCOIS (FR); MANDRAND BER)      1-35
           10 August 1995
           see the whole document

A          WO 94 28138 A (UNIV LONDON; GARSON JEREMY (GB); TUKE PHILIP (GB)) 8 December 1994       1-35
           see the whole document



Further documents are listed in the continuation of box C. 

Patent family members are listed in annex. 



Special categories of cited documents: 

"A" document defining the general state of the art which is not considered to be of particular relevance 

"E" earlier document but published on or after the international filing date 

"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified) 

"O" document referring to an oral disclosure, use, exhibition or other means 

"P" document published prior to the international filing date but later than the priority date claimed 

"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention 

"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone 

"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art 

"&" document member of the same patent family 



Date of the actual completion of the international search: 22 April 1998 

Date of mailing of the international search report: 08/05/1998 

Name and mailing address of the ISA: 
European Patent Office, P.B. 5818 Patentlaan 2 
NL-2280 HV Rijswijk 
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl, 
Fax: (+31-70) 340-3016 

Authorized officer: Hagenmaler, S 

Form PCT/ISA/210 (second sheet) (July 1992) 





C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT 

Category   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

A          PERRON H ET AL: "IN VITRO TRANSMISSION AND ANTIGENICITY OF A RETROVIRUS ISOLATED        1-35
           FROM A MULTIPLE SCLEROSIS PATIENT"
           RESEARCH IN VIROLOGY, vol. 143, no. 5, 1 January 1992, pages 337-350, XP000569296
           see the whole document

P,X        PERRON ET AL.: "MOLECULAR IDENTIFICATION OF A NOVEL RETROVIRUS REPEATEDLY ISOLATED      1-35
           FROM PATIENTS WITH MULTIPLE SCLEROSIS"
           PNAS, vol. 94, July 1997, pages 7583-7588, XP002062853
           see the whole document

P,A        WO 97 06260 A (BIO MERIEUX; PERRON HERVE (FR); BESEME FREDERIC (FR); BEDIN FREDER)      1-35
           20 February 1997
           see the whole document



Form PCT/ISA/210 (continuation of second sheet) (July 1992) 



Patent document          Publication   Patent family    Publication
cited in search report   date          member(s)        date

EP 0731168 A             11-09-96      FR 2731356       13-09-96
                                       AU 5007396       02-10-96
                                       [illegible]      02-09-97
                                       CA 2171242       10-09-96
                                       CZ 9603287       12-03-97
                                       WO 9628552       19-09-96
                                       JP 8322579       10-12-96
                                       NO 964760        08-11-96
                                       PL 317200        17-03-97

WO 9521256 A             10-08-95      FR 2715936       11-08-95
                                       FR 2715938       11-08-95
                                       FR 2715939       11-08-95
                                       FR 2715937       11-08-95
                                       FR 2727428       31-05-96
                                       FR 2728585       28-06-96
                                       AU 1711495       21-08-95
                                       CA 2141907       05-08-95
                                       DE 674004 T      19-09-96
                                       EP 0674004 A     27-09-95
                                       FI 954699 A      03-10-95
                                       JP 8511170 T     26-11-96
                                       NO 953925 A      04-12-95

WO 9428138 A             08-12-94      AU 676568 B      13-03-97
                                       AU 6760094 A     20-12-94
                                       CA 2163641 A     08-12-94
                                       EP 0700441 A     13-03-96
                                       JP 8511936 T     17-12-96

WO 9706260 A             20-02-97      FR 2737500 A     07-02-97
                                       AU 6823296 A     05-03-97
                                       EP 0789077 A     13-08-97
                                       NO 971493 A      03-06-97
                                       PL 319512 A      18-08-97

Form PCT/ISA/210 (patent family annex) (July 1992) 



